Newline

The term line break comes from electronic text processing and denotes the point in a text at which it continues from one line to the next.

General

On a typewriter, the line break is performed explicitly by pressing a key or operating a lever. This performs two functions:

  • Carriage return - moving the print position to the beginning of the line (far left).
  • Line feed - moving the print position down by one line.

With the introduction of the teleprinter, various control characters (encodings of electrical signals) were introduced to represent the line-break function of a typewriter. Because teleprinters served as the first output devices of computer systems, these characters were carried over from telecommunications into electronic data processing.

Plain text files on a computer initially resemble, in their on-screen representation, text written on a typewriter; the control characters are generally invisible to the user. With scroll bars, the relationship between screen width and line length is lost, and with proportional fonts so is the relationship between the number of characters and the line length. In formats with more detailed features (Rich Text Format and the like), the line-break characters then have a meaning only within the markup.

Because the control characters were not precisely specified in the early days of computer technology, they and their changing functions remain one of the major incompatibilities between different operating systems and application software.

In the text formatting of word processing systems, a distinction is made between a paragraph break and a line break, as well as between a hard (manual) and a soft (automatic) line break. The input methods and control characters follow the conventions of common word processing programs; operation and representation may nevertheless differ depending on the system.

Further breaks in the flow of lines arise at a page change (page break) and in multi-column layout (column break).

In the printing industry, the arrangement of lines, columns and pages around picture elements, graphics and the like is called Mettage (make-up). In electronic data processing, word processing software takes on this task: the more powerful it is, the more attractive and readable the resulting layout.

Encoding of the line break

ASCII

When the ASCII character set was developed, two characters were reserved:

  • The control character for the line feed (German: Zeilenvorschub, LF for short) is encoded as ASCII character 10 (hexadecimal 0A). Some systems allow the LF character to be entered with the key combination Ctrl+J.
  • The control character for the carriage return (German: Wagenrücklauf, CR for short) is encoded as ASCII character 13 (hexadecimal 0D). Some systems allow the CR character to be entered with the key combination Ctrl+M.
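The two code values can be checked directly. The following is a minimal Python sketch; the helper names `describe` and `ctrl` are hypothetical, and the masking rule in `ctrl` (keeping only the low five bits of the letter) is the usual terminal convention, stated here as an assumption about why Ctrl J and Ctrl M produce LF and CR.

```python
LF = "\n"  # line feed
CR = "\r"  # carriage return


def describe(ch):
    """Return the decimal and hexadecimal code of a single character."""
    return f"{ord(ch)} (hex {ord(ch):02X})"


def ctrl(letter):
    """Control character produced by Ctrl plus a letter (assumed convention:
    keep only the low five bits of the upper-case letter's code)."""
    return chr(ord(letter.upper()) & 0x1F)


print("LF:", describe(LF))  # LF: 10 (hex 0A)
print("CR:", describe(CR))  # CR: 13 (hex 0D)
print(ctrl("J") == LF, ctrl("M") == CR)  # True True
```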

Different conventions exist for explicitly encoding the line break in a text file.

On IBM mainframes, the line break is not stored in files as a control character. Instead, the line length is stored either in the DCB (record format F or FB) or in a length field at the beginning of each line (record format V or VB).
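The idea of record-oriented storage without control characters can be sketched in Python. This is a simplified illustration, not the actual mainframe layout: the function names are hypothetical, and the two-byte big-endian length prefix is an assumption chosen for brevity.

```python
import struct
from typing import List


def split_fixed(data: bytes, record_length: int) -> List[bytes]:
    """Fixed-length records: the length is known from outside the data."""
    return [data[i:i + record_length]
            for i in range(0, len(data), record_length)]


def split_length_prefixed(data: bytes) -> List[bytes]:
    """Variable-length records: each record starts with its own length.

    Assumed layout (hypothetical): 2-byte big-endian payload length,
    followed by the payload.
    """
    records = []
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from(">H", data, pos)
        pos += 2
        records.append(data[pos:pos + length])
        pos += length
    return records


print(split_fixed(b"AAAABBBBCCCC", 4))
print(split_length_prefixed(b"\x00\x02hi\x00\x05there"))
```

In both cases the lines are recovered without any LF or CR appearing in the data.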

In Mac OS X, because of the extensive compatibility with its predecessor Mac OS, there are still some text formats that use CR instead of LF as the line separator. Many modern Mac OS X programs can therefore handle both formats in text files. If wrongly declared files use CR LF, some programs produce doubled line breaks. Only files that originate from the BSD or Unix world are, for the most part, strictly bound to LF as the line separator.
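The doubled line breaks can be reproduced with a small Python sketch. Both function names are hypothetical; the first mimics a program that treats CR and LF each as a separate break, the second replaces CR LF before handling a lone CR.

```python
def naive_convert(text):
    """Treat every CR as its own line break (doubles the breaks for CR LF)."""
    return text.replace("\r", "\n")


def normalize_newlines(text):
    """Map CR LF, CR and LF to a single LF.

    CR LF is replaced first so that it is not counted as two breaks.
    """
    return text.replace("\r\n", "\n").replace("\r", "\n")


sample = "one\r\ntwo\r\n"
print(repr(naive_convert(sample)))       # 'one\n\ntwo\n\n'
print(repr(normalize_newlines(sample)))  # 'one\ntwo\n'
```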

Unicode: further characters that mark a line break

For Unicode text, the Unicode standard requires, in the Unicode line breaking algorithm, that software claiming Unicode conformance recognize further characters as line breaks in addition to the sequences CR, LF and CR LF described above.
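As an illustration, Python's built-in `str.splitlines` treats, among others, NEXT LINE (U+0085), LINE SEPARATOR (U+2028) and PARAGRAPH SEPARATOR (U+2029) as line boundaries in addition to CR, LF and CR LF. Whether this set coincides exactly with the characters meant above is an assumption; the sketch only shows that such additional characters exist and how software may honour them.

```python
# One string containing six different line terminators.
text = "a\u2028b\u2029c\u0085d\r\ne\nf\rg"

lines = text.splitlines()
print(lines)  # ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# Splitting only on LF, by contrast, misses most of the breaks.
print(len(text.split("\n")))  # 3
```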

Programming: Encoding of the line break

Because of the different conventions for encoding line breaks on computer systems, which arose when teleprinter and typewriter conventions were adopted into electronic text processing, problems occur when text is exchanged between different systems.

A well-known example is printf() or fprintf() from the C standard library, the latter used for writing to files. In C, the escape sequence \n (LF) stands for a line break. When writing to files, C distinguishes between text mode and binary mode. For files opened in text mode, \n is translated into the control characters customary for line breaks on the respective system. On Unix-like operating systems no conversion takes place, since LF already denotes a line break there. Under Windows, by contrast, it is replaced with CR LF. The resulting files are therefore not identical. If the file is opened in binary mode, no translation takes place and a single LF is always written to the file.
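The same distinction can be sketched in Python rather than C. In C the translation is chosen by the platform; here the `newline` parameter of `open()` forces each convention explicitly so that the result is the same on every system. The helper name `written_bytes` and the file name are hypothetical.

```python
import os
import tempfile


def written_bytes(newline):
    """Write two lines in text mode and return the raw bytes on disk."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "demo.txt")
        # Text mode: "\n" is translated to the given newline string.
        with open(path, "w", newline=newline) as f:
            f.write("line 1\nline 2\n")
        # Binary mode: the bytes are read back without any translation.
        with open(path, "rb") as f:
            return f.read()


unix_style = written_bytes("\n")       # LF stays LF
windows_style = written_bytes("\r\n")  # LF becomes CR LF

print(unix_style)     # b'line 1\nline 2\n'
print(windows_style)  # b'line 1\r\nline 2\r\n'
```

The program text is identical in both calls, yet the files differ by one byte per line.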

In Java, the escape sequences \n and \r are available; no conversion takes place, but separate functions can be used to insert the platform-dependent line-break characters. When reading, the Java library is tolerant and accepts CR, LF and CR LF as the end of a line. Other programming languages such as Visual Basic or Perl provide similar functionality for processing text files correctly.
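Python offers comparable facilities, shown here as an analogue rather than as the Java API: `os.linesep` holds the platform-dependent line separator, and a text stream opened with `newline=None` accepts CR, LF and CR LF alike. The function name `read_lines_tolerant` is hypothetical.

```python
import io
import os


def read_lines_tolerant(raw: bytes):
    """Decode bytes and split into lines, accepting CR, LF and CR LF."""
    # newline=None enables universal newlines: all three forms become "\n".
    stream = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None)
    return stream.read().splitlines()


print(repr(os.linesep))  # platform-dependent separator
print(read_lines_tolerant(b"mac\runix\nwindows\r\n"))
```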

Many network protocols for the transmission of text, such as HTTP, SMTP or FTP, define the sequence CR LF for line breaks. Some programs, such as mail transfer agents, are strict and even refuse to process data containing single LFs ("bare LF"). Other protocols, however, recommend interpreting a single LF as a (possibly soft) line break.
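A minimal Python sketch of both aspects follows: building protocol lines terminated by CR LF, and detecting a bare LF the way a strict receiver might. The function names, the host name and the chosen header fields are illustrative assumptions, not taken from any particular implementation.

```python
import re

CRLF = "\r\n"

# An LF that is not immediately preceded by a CR.
BARE_LF = re.compile(rb"(?<!\r)\n")


def build_request_head(host, path="/"):
    """Assemble an HTTP/1.1 request head with CR LF after every line."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close"]
    # The empty line that ends the head is a further CR LF.
    return (CRLF.join(lines) + CRLF + CRLF).encode("ascii")


def has_bare_lf(data: bytes) -> bool:
    """Report whether the data contains an LF without a preceding CR."""
    return BARE_LF.search(data) is not None


head = build_request_head("example.org")
print(head)
print(has_bare_lf(head))                      # False
print(has_bare_lf(b"Subject: hi\nbody\r\n"))  # True
```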

Labeling unspecified or unwanted line breaks

A typographic line break that has been suppressed is marked, for example, when poetry is quoted inline (line citation):

"Ich saz ûf eime steine / und dahte bein mit beine, / dar ûf satzt ich den ellenbogen; [...]"

In this way (with a virgule) the ends of the rhymed lines, for instance, are marked; stanzas or more pronounced paragraph breaks can then be set with "//".

Conversely, in electronic text processing it may be necessary to mark a line break that arises as unwanted. This occurs, for example, in programming languages in which the line break is a control character, but also when URLs (web addresses) are given. Characters used for this purpose include "_" (underscore) or "\" (backslash), whichever is not otherwise used as a control character in the respective format, or a character such as "↩" (U+21A9).

The character "↩" is a print-typographic instruction meaning "ignore the line break". When such a passage is copied and pasted, for instance into the address bar of a browser, some programs ignore the part after the line break, while others join the web link back together, in which case the "↩" sign has to be removed manually. In a purely electronic medium the character is therefore rather a nuisance.
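Removing such markers need not be done by hand. The following Python sketch joins lines that were broken with a continuation marker; the function name `join_marked_breaks`, the sample URL and the decision to also drop the indentation of the continuation line are assumptions made for illustration.

```python
import re


def join_marked_breaks(text, marker="\u21a9"):
    """Remove each marker together with the line break that follows it.

    Whitespace around the break (trailing spaces and the indentation of the
    continuation line) is dropped as well.
    """
    pattern = re.escape(marker) + r"[ \t]*(?:\r\n|\r|\n)[ \t]*"
    return re.sub(pattern, "", text)


broken = "https://example.org/very/long/\u21a9\n    path?x=1"
print(join_marked_breaks(broken))  # https://example.org/very/long/path?x=1

# The same helper works for a backslash used as the marker.
print(join_marked_breaks("a \\\nb", marker="\\"))  # a b
```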

In proofreading in the printing industry, dedicated correction marks are used for a missing paragraph break ("insert line break") and for an unwanted one ("remove line break", i.e. "join paragraph").
