String handling operations
A string is a data type used to represent a sequence of one or more alphanumeric characters. These characters can be letters, numbers or symbols. It is usually possible to manipulate a string to provide information or to alter the contents of a string. Strings are shown in quotes (single or double), for example: 'Hello', or "World!".
A variable can have string as its data type. For example:
wordOne ← "Computer"
wordTwo ← "Science"
Length
The length of a string can be determined using the pseudocode function LEN. This gives the length as an integer.
LEN(wordOne)
- would give the answer 8, as there are eight characters in the word "Computer"
Character position
It is possible to determine which character features at a position within a string:
wordOne[2]
- would give the answer "m", as m is the third character in the word "Computer" (remember computers start counting at zero).
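As a rough Python equivalent of the two operations above, the built-in len function plays the role of LEN and square brackets select a character by position. The variable names word_one, length and third_character are chosen here for illustration only.

```python
word_one = "Computer"

# len returns the number of characters as an integer
length = len(word_one)

# Positions start at zero, so position 2 holds the third character
third_character = word_one[2]

print(length)
print(third_character)
```

The length is stored as an integer rather than as the word "eight", so it can be used directly in calculations or loops.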
Substring
A substring is a string of characters that exists inside a string, for example, the first three characters of a password.
The code below extracts a substring of the first three letters of a string in two ways:
Code version one:
wordOne ← "Computer"
sub ← ""
FOR count ← 0 TO 2
    sub ← sub + wordOne[count]
ENDFOR
OUTPUT sub
Code version two:
wordOne ← "Computer"
OUTPUT SUBSTRING(0, 2, wordOne)
Both versions would give "Com", the first three characters in the string. However, using the SUBSTRING function saves having to write as much code each time a substring is needed.
The SUBSTRING function takes three parameters:
- an integer that indicates the starting point in the string
- an integer that indicates the end point within the string (the character at this position is included)
- the string that the substring is to be taken from
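Python has no SUBSTRING function, but slicing can do the same job. The sketch below assumes a hypothetical helper named substring that takes the same three parameters in the same order. A Python slice stops just before its end position, so the helper adds 1 to include the end character.

```python
def substring(start, end, text):
    """Hypothetical helper that mimics SUBSTRING(start, end, string)."""
    # Python slices exclude the end position, so add 1 to include it
    return text[start:end + 1]


word_one = "Computer"

# Version one: build the substring one character at a time
sub = ""
for count in range(0, 3):  # count takes the values 0, 1 and 2
    sub = sub + word_one[count]

# Version two: take the substring in a single step
sub_two = substring(0, 2, word_one)

print(sub)
print(sub_two)
```

Both approaches produce the same three characters, which is a quick way to check that the helper handles the end position correctly.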
Concatenation
Concatenation is the joining together of two or more strings. In the following program, the variable 'fullName' is created by joining the 'firstName' and 'lastName' values together with a space between the strings.
firstName ← "Bob"
lastName ← "Smith"
fullName ← firstName + " " + lastName
OUTPUT fullName
This would output the name Bob Smith.
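The same program can be sketched in Python, where the + operator also joins strings. The snake_case variable names are a Python convention and not part of the pseudocode.

```python
first_name = "Bob"
last_name = "Smith"

# The " " in the middle adds the space between the two names
full_name = first_name + " " + last_name

print(full_name)
```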
String conversion
Sometimes a programmer needs to change the data type stored within a variable. For example, an integer may need to be converted to a string in order to be displayed as part of a message. This process is known as casting. The following examples convert an integer to a string and a string to an integer using Python:
str(68) returns "68"
int("54") returns 54
AQA pseudo-code uses the following functions:
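- STRING_TO_INT(string) - converts a string to an integer
- STRING_TO_REAL(string) - converts a string to a real
- INT_TO_STRING(integer) - converts an integer to a string
- REAL_TO_STRING(real) - converts a real to a string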
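To show why casting matters, here is a short Python sketch. It assumes that int, float and str are reasonable stand-ins for the four pseudo-code functions above. The variable names and values are made up for illustration.

```python
score = 68

# An integer must be converted to a string before it can be joined to text
message = "Your score is " + str(score)

# Text that contains digits must be converted before doing arithmetic
age_text = "54"
age_next_year = int(age_text) + 1

# float converts a string to a real number, and str converts it back
price = float("2.50")
price_text = str(price)

print(message)
print(age_next_year)
print(price_text)
```

Note that converting "2.50" to a real and back again gives "2.5", because the number itself does not store the trailing zero.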
Character set conversions
Another way of dealing with data is to convert alphanumeric data into its numerical format in a character set. The pseudo-code CHAR_TO_CODE and CODE_TO_CHAR functions allow characters to be swapped between their text and number formats using the ASCII or Unicode character set. Read more about these character sets in the study guide.
CHAR_TO_CODE(character) - converts an alphanumeric character to its ASCII/Unicode character code
CODE_TO_CHAR(integer) - converts an ASCII/Unicode character code to its alphanumeric character
In the code below, the user enters a character. The code checks that this is between lower-case 'a' and 'z'.
letter ← USERINPUT
IF (CHAR_TO_CODE(letter) >= CHAR_TO_CODE("a"))
AND
(CHAR_TO_CODE(letter) <= CHAR_TO_CODE("z"))
THEN
    OUTPUT "Character correct!"
ELSE
    OUTPUT "Incorrect"
ENDIF
In this example, CHAR_TO_CODE has been used to create a range check for the data.
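In Python, the built-in ord function returns the character code of a single character, so the range check can be sketched as below. The function name is_lower_case_letter is hypothetical. The length check is an added assumption, because ord raises an error if it is given anything other than one character.

```python
def is_lower_case_letter(letter):
    """Hypothetical helper: True if letter is one character from 'a' to 'z'."""
    # ord only accepts a single character, so reject anything else first
    if len(letter) != 1:
        return False
    # Compare character codes, as the pseudocode does with CHAR_TO_CODE
    return ord("a") <= ord(letter) <= ord("z")


print(is_lower_case_letter("g"))
print(is_lower_case_letter("G"))
```

Upper-case letters fail the check because their character codes are lower than the code for 'a'.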
In contrast, CODE_TO_CHAR takes an ASCII code and converts it to the alphanumeric character. This is useful when random letters need to be generated. For example, in the code below a random number is converted to a letter as part of a word guessing game.
letter ← RANDOM_INT(97, 122)
OUTPUT "does it contain " + CODE_TO_CHAR(letter) + "?"
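A possible Python version of this idea uses random.randint in place of RANDOM_INT and chr in place of CODE_TO_CHAR. The variable names code and letter are chosen for illustration.

```python
import random

# 97 to 122 are the ASCII codes for 'a' to 'z'; randint includes both ends
code = random.randint(97, 122)

# chr converts the character code back into its character
letter = chr(code)

print("Does it contain " + letter + "?")
```

Keeping the number and the character in separate variables makes it clearer which one is a code and which one is text.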