A string of bits in a computer need not represent a number. In fact, in practical applications, most computer input and output is alphanumeric. The most common type of alphanumeric data is "text", that is, strings of characters drawn from a character set. Each character is represented in the computer by a bit pattern assigned according to an established convention.
A "character set" is a set of characters, each represented with an n-bit pattern. "Character" is used in a general sense, in that the character set may include n-bit representations of punctuation, symbols, or any other glyph used in a written language. The character set may also include n-bit representations of control codes rather than characters to be displayed or printed. For example, one character might return the print head to the first column and another might advance to the next line. The n-bit representation of a character is often referred to as its "code point".
An example of a commonly used character set is one coded in accordance with the ASCII standard. In an ASCII-coded character set, each of 128 different characters is represented with a unique 7-bit string.
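As a brief illustration of code points and 7-bit patterns, the following Python sketch prints the ASCII code point and bit string for a few printable characters and for the two control codes mentioned above (carriage return and line feed). The helper name `ascii_bits` is hypothetical and chosen only for this example.

```python
def ascii_bits(ch: str) -> str:
    """Return the 7-bit pattern for a single ASCII character.

    `ascii_bits` is a hypothetical helper name used only in this sketch.
    """
    code_point = ord(ch)
    if code_point > 127:
        raise ValueError("not an ASCII character")
    # Format the code point as a zero-padded 7-digit binary string.
    return format(code_point, "07b")


# Printable characters and two control codes (carriage return, line feed).
for ch in ["A", "a", "0", "\r", "\n"]:
    print(repr(ch), ord(ch), ascii_bits(ch))
```

For instance, "A" has code point 65, whose 7-bit pattern is 1000001, and the carriage return control code has code point 13, or 0001101.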
There are many different character sets recognized by various computer systems throughout the world. Some are different because they reflect a different written language, but even a single language may have different character sets. For example, ASCII has a number of character set variations. Different character sets may assign different code points to the same character, or may assign the same code point to different characters. For example, the letter "A" may have one code point in one character set and a different code point in another character set. Or, the same code point value might represent "A" in one character set and "a" in another.
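Both kinds of mismatch can be observed with the codecs that ship with Python. The sketch below is only an illustration: the particular character sets (ASCII, EBCDIC code page 037, ISO 8859-1, and ISO 8859-15) are assumptions picked for the example and are not named in the text above.

```python
# Same character, different code points:
# "A" in ASCII versus "A" in EBCDIC code page 037 (an assumed example pair).
ascii_a = "A".encode("ascii")[0]
ebcdic_a = "A".encode("cp037")[0]
print(hex(ascii_a), hex(ebcdic_a))

# Same code point, different characters:
# the value 0xA4 in ISO 8859-1 versus ISO 8859-15 (another assumed pair).
latin1_char = bytes([0xA4]).decode("latin-1")
latin9_char = bytes([0xA4]).decode("iso-8859-15")
print(repr(latin1_char), repr(latin9_char))
```

Here "A" is 0x41 in ASCII but 0xC1 in code page 037, while the value 0xA4 is the generic currency sign in ISO 8859-1 and the euro sign in ISO 8859-15.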
To input text that is written with one character set to a computer that is configured to decode a different character set, some sort of translation must occur. For example, if it is desired to display an "A" coded with a source character set on a computer that recognizes a different character set, the code point for "A" in the source character set must be translated to the code point for "A" in the character set of the computer.
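One simple way to perform such a translation, sketched here under stated assumptions, is a 256-entry lookup table indexed by source code point. The source and target character sets (EBCDIC code page 037 and ISO 8859-1) and the names `build_translation_table` and `translate` are hypothetical choices for this example, not a method described in the text.

```python
SOURCE = "cp037"    # assumed source character set (EBCDIC code page 037)
TARGET = "latin-1"  # assumed character set of the receiving computer


def build_translation_table(source: str, target: str) -> bytes:
    """Build a 256-byte table mapping each source code point to a target one."""
    # Decode every possible source code point to its character ...
    chars = bytes(range(256)).decode(source)
    # ... then encode each character in the target set; characters with no
    # equivalent in the target set become "?" instead of raising an error.
    table = chars.encode(target, errors="replace")
    if len(table) != 256:
        raise ValueError("both character sets must be single-byte")
    return table


TABLE = build_translation_table(SOURCE, TARGET)


def translate(data: bytes) -> bytes:
    """Translate source-coded bytes into target-coded bytes via the table."""
    return data.translate(TABLE)


source_text = "A".encode(SOURCE)
print(source_text.hex(), "->", translate(source_text).hex())  # c1 -> 41
```

A table lookup suits single-byte character sets because each code point translates independently of its neighbors; multi-byte or variable-length encodings would need a different approach.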