Signed overpunch

Summary

In computing, a signed overpunch is a coding scheme that stores the sign of a number by modifying one of its digits, usually the last. It is used in character data on IBM mainframes by languages such as COBOL, PL/I, and RPG.[1] Its purpose is to save the character that a separate sign would otherwise occupy.[2] The code is derived from the Hollerith punched card code, in which a digit and a sign can both be entered in the same card column. It is called an overpunch because the digit in that column has a 12-punch or an 11-punch above it to indicate the sign. The top three rows of the card are called zone punches,[3] so numeric character data that may contain overpunches is called zoned decimal.

In IBM terminology, the low-order four bits of a byte in storage are called the digit, and the high-order four bits are the zone.[4] The digit bits contain the numeric value 0–9. In most positions the zone bits contain 'F'x, forming the characters 0–9. In the character position that carries the overpunch, the zone bits instead contain a hexadecimal value that indicates the sign, forming a different set of characters. Zones A, C, E, and F indicate positive values; zones B and D indicate negative values.
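The zone and digit split can be illustrated with a short Python sketch. The function names are hypothetical and chosen for illustration; the sign zones are the ones listed above.

```python
# Zone values that mark a sign, as listed in the text above.
POSITIVE_ZONES = {0xA, 0xC, 0xE, 0xF}
NEGATIVE_ZONES = {0xB, 0xD}


def split_zoned_byte(byte):
    """Return (zone, digit): the high-order and low-order four bits."""
    return byte >> 4, byte & 0x0F


def sign_of_zone(zone):
    """Return +1 or -1 for a sign zone; raise ValueError for any other zone."""
    if zone in POSITIVE_ZONES:
        return 1
    if zone in NEGATIVE_ZONES:
        return -1
    raise ValueError(f"zone {zone:X} is not a sign code")


zone, digit = split_zoned_byte(0xD1)   # EBCDIC 'J'
print(f"{zone:X}", digit, sign_of_zone(zone))   # prints: D 1 -1
```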

The PACK instruction on IBM System/360 architecture machines converts the sign of a zoned decimal number when converting to packed decimal, and the corresponding UNPK instruction will set the correct overpunched sign of its zoned decimal output.[5]
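The effect of the two instructions on the sign can be modelled roughly as follows. This is a simplified sketch, not a description of the hardware: the real instructions take explicit operand lengths and pad or truncate accordingly, and the function names `pack` and `unpack` are hypothetical. The model keeps only the digit bits of each byte and moves the zone of the last byte into the low-order sign position of the packed result.

```python
def pack(zoned: bytes) -> bytes:
    """Simplified model of PACK: zoned decimal bytes to packed decimal."""
    if not zoned:
        raise ValueError("empty field")
    # One digit nibble per input byte, followed by the sign nibble,
    # which is taken from the zone of the last byte.
    nibbles = [b & 0x0F for b in zoned]
    nibbles.append(zoned[-1] >> 4)
    if len(nibbles) % 2:
        nibbles.insert(0, 0)          # pad on the left to whole bytes
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def unpack(packed: bytes) -> bytes:
    """Simplified model of UNPK: packed decimal back to zoned decimal."""
    if not packed:
        raise ValueError("empty field")
    nibbles = []
    for b in packed:
        nibbles += [b >> 4, b & 0x0F]
    sign = nibbles.pop()              # last nibble is the sign
    zoned = [0xF0 | d for d in nibbles]
    zoned[-1] = (sign << 4) | nibbles[-1]   # overpunch the last digit
    return bytes(zoned)


packed = pack(bytes.fromhex("F1F0F2D1"))    # zoned -1021
print(packed.hex().upper())                 # prints: 01021D
print(unpack(packed).hex().upper())         # prints: F0F1F0F2D1
```

In this model the unpacked result is one byte longer than the original field because the padding nibble added by `pack` comes back as a leading zero.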

Language support

PL/I

PL/I uses the PICTURE attribute to declare zoned decimal data with a signed overpunch. Each character in a numeric picture represents a digit, except V, which indicates the position of the assumed decimal point. A picture character of T, I, or R indicates a digit position which may contain an overpunch. T indicates that the position will contain {–I if positive and }–R if negative. I indicates that the position will contain {–I if positive and 0–9 if negative. R indicates that the position will contain 0–9 if positive and }–R if negative.

For example, PICTURE 'Z99R' describes a four-character numeric field. The first position may be blank or will contain a digit 0–9. The next two positions will contain digits, and the fourth position will contain 0–9 for a positive number and }–R for a negative number.[6]

Assigning the value 1021 to the above picture will store the characters "1021" in memory; assigning -1021 will store "102J".
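The behaviour of the three picture characters can be imitated in Python. The sketch below is illustrative only: `overpunch_digit` and `store_z99r` are hypothetical names, and `store_z99r` models just the single picture 'Z99R', treating Z as "blank if the leading digit is zero".

```python
POSITIVE = "{ABCDEFGHI"   # digits 0-9 overpunched with a plus sign
NEGATIVE = "}JKLMNOPQR"   # digits 0-9 overpunched with a minus sign


def overpunch_digit(digit, negative, picture_char):
    """Return the stored character for one T, I, or R picture position."""
    plain = str(digit)
    if picture_char == "T":
        return NEGATIVE[digit] if negative else POSITIVE[digit]
    if picture_char == "I":
        return plain if negative else POSITIVE[digit]
    if picture_char == "R":
        return NEGATIVE[digit] if negative else plain
    raise ValueError(f"not an overpunch picture character: {picture_char!r}")


def store_z99r(value):
    """Model of assigning an integer to a field declared PICTURE 'Z99R'."""
    digits = f"{abs(value):04d}"
    if len(digits) > 4:
        raise ValueError("value does not fit in four digits")
    first = " " if digits[0] == "0" else digits[0]   # Z: suppress a leading zero
    return first + digits[1:3] + overpunch_digit(int(digits[3]), value < 0, "R")


print(store_z99r(1021))    # prints: 1021
print(store_z99r(-1021))   # prints: 102J
```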

COBOL

COBOL uses the picture character 'S' for USAGE IS DISPLAY data without SIGN IS SEPARATE CHARACTER to indicate an overpunch. SIGN IS LEADING indicates that the overpunch is over the first character of the field. SIGN IS TRAILING locates it over the last character. SIGN IS TRAILING is the default.[7]
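The difference between the two sign positions can be shown with a small Python model of a signed DISPLAY field. The function `encode_display` and its parameters are hypothetical, and the sketch assumes that positive values are also stored with an overpunch ({ and A–I); a given compiler or option setting may instead store a plain digit for positive values.

```python
POSITIVE = "{ABCDEFGHI"   # digits 0-9 overpunched with a plus sign
NEGATIVE = "}JKLMNOPQR"   # digits 0-9 overpunched with a minus sign


def encode_display(value, width, leading=False):
    """Model a PIC S9(width) DISPLAY field with an overpunched sign."""
    digits = f"{abs(value):0{width}d}"
    if len(digits) > width:
        raise ValueError("value does not fit in the field")
    table = NEGATIVE if value < 0 else POSITIVE
    if leading:
        # SIGN IS LEADING: overpunch the first character.
        return table[int(digits[0])] + digits[1:]
    # SIGN IS TRAILING (the default): overpunch the last character.
    return digits[:-1] + table[int(digits[-1])]


print(encode_display(-100, 3))                 # prints: 10}
print(encode_display(-100, 3, leading=True))   # prints: J00
```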

C/C++

The C language has no provision for zoned decimal. The IBM ILE C/C++ compiler for System i provides functions for conversion between int or double and zoned decimal:[8]

  • QXXDTOZ() — Convert Double to Zoned Decimal
  • QXXITOZ() — Convert Integer to Zoned Decimal
  • QXXZTOD() — Convert Zoned Decimal to Double
  • QXXZTOI() — Convert Zoned Decimal to Integer

EBCDIC overpunch codes

EBCDIC character   Digit   Sign   Card code[9]
{                  0       +      12-0
A                  1       +      12-1
B                  2       +      12-2
C                  3       +      12-3
D                  4       +      12-4
E                  5       +      12-5
F                  6       +      12-6
G                  7       +      12-7
H                  8       +      12-8
I                  9       +      12-9
}                  0       -      11-0
J                  1       -      11-1
K                  2       -      11-2
L                  3       -      11-3
M                  4       -      11-4
N                  5       -      11-5
O                  6       -      11-6
P                  7       -      11-7
Q                  8       -      11-8
R                  9       -      11-9

Examples

10} is -100
45A is 451
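These examples can be reproduced by looking up the last character in the table of overpunch codes. The following Python sketch assumes a trailing overpunch and treats a plain final digit as positive; the name `decode_trailing` is hypothetical.

```python
# Map each overpunch character to (digit, sign), following the table above.
OVERPUNCH = {}
for digit, (pos, neg) in enumerate(zip("{ABCDEFGHI", "}JKLMNOPQR")):
    OVERPUNCH[pos] = (digit, 1)
    OVERPUNCH[neg] = (digit, -1)


def decode_trailing(field):
    """Return the integer value of a field with a trailing overpunch."""
    body, last = field[:-1], field[-1]
    if last in OVERPUNCH:
        digit, sign = OVERPUNCH[last]
    else:
        digit, sign = int(last), 1    # plain digit: no overpunch present
    return sign * int(body + str(digit))


print(decode_trailing("10}"))   # prints: -100
print(decode_trailing("45A"))   # prints: 451
```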

ASCII representation

Representation of signed overpunch characters "is not standardized in ASCII, and different compilers use different overpunch codes." In some cases, "the representation is not the same as the result of converting an EBCDIC Signed field to ASCII with a translation table."[10] In other cases they are the same, to maintain source-data compatibility at the loss of the connection between the character code and the corresponding digit.

An EBCDIC negative field ending with the digit '1' will encode that digit as 'D1'x, upper-case 'J', where the digit is '1' and the zone is 'D' to indicate a negative field. ASCII upper-case 'J' is '4A'x, where the hexadecimal value bears no relationship to the numeric value. An alternative encoding uses lower-case 'q', '71'x, for this representation, where the digit is '1' and the zone is '7'. This preserves the digit and the collating sequence at the cost of having to recognize and translate fields with overpunches individually.
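The byte values mentioned above can be checked in Python, whose standard `cp037` codec implements one common EBCDIC code page (an assumption here; other EBCDIC code pages may place some characters differently). The helper `nibbles` is a hypothetical name.

```python
def nibbles(byte):
    """Return (zone, digit): the high-order and low-order four bits."""
    return byte >> 4, byte & 0x0F


ebcdic_j = "J".encode("cp037")[0]   # EBCDIC 'J'
ascii_j = "J".encode("ascii")[0]    # ASCII 'J'
ascii_q = "q".encode("ascii")[0]    # ASCII 'q'

for name, value in [("EBCDIC J", ebcdic_j), ("ASCII J", ascii_j), ("ASCII q", ascii_q)]:
    zone, digit = nibbles(value)
    print(f"{name}: {value:02X} zone={zone:X} digit={digit:X}")
# prints:
# EBCDIC J: D1 zone=D digit=1
# ASCII J: 4A zone=4 digit=A
# ASCII q: 71 zone=7 digit=1
```

Only the EBCDIC 'J' and the ASCII 'q' carry the digit 1 in their low-order four bits.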

Examples

GnuCOBOL and Micro Focus COBOL use lower-case 'p' through 'y' to represent negative '0' through '9'.[11][12]
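Under this convention the digit can be recovered from the low-order four bits of the character code. The sketch below is a minimal Python illustration with a hypothetical function name; it assumes a trailing sign and assumes that a plain final digit means a positive value, which the description above does not state.

```python
def decode_lowercase_overpunch(field):
    """Decode a trailing-sign field that uses 'p'-'y' for negative 0-9."""
    body, last = field[:-1], field[-1]
    if "p" <= last <= "y":
        digit, sign = ord(last) & 0x0F, -1   # 'p' is 0x70, 'y' is 0x79
    elif "0" <= last <= "9":
        digit, sign = int(last), 1           # assumption: plain digit is positive
    else:
        raise ValueError(f"unexpected sign character: {last!r}")
    return sign * int(body + str(digit))


print(decode_lowercase_overpunch("102q"))   # prints: -1021
print(decode_lowercase_overpunch("10p"))    # prints: -100
```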

PL/I compilers on ASCII systems use the same set of characters ({, A–I, }, J–R) as EBCDIC to represent overpunches.[13]

References

  1. IBM Corporation (June 1994). RPG/400 Reference (PDF). p. 403. Retrieved Aug 7, 2018.
  2. "Tech Talk, COBOL Tutorials, EBCDIC to ASCII Conversion of Signed Fields". Retrieved Mar 15, 2008.
  3. Van Overberghe, Jr., Albert G. (1987). Data Processing Technician Third Class. Naval Education and Training Program. pp. 3–8. Retrieved Jan 12, 2022.
  4. IBM Corporation. IBM System/360 Principles of Operation (PDF). p. 34. Retrieved Jan 12, 2022.
  5. IBM Corporation (Oct 2001). z/Architecture Principles of Operation (2nd ed.). pp. 7–112, 7–158. Retrieved Aug 7, 2018.
  6. IBM Corporation (June 1995). IBM PL/I for MVS & VM Language Reference (PDF). pp. 294–296. Retrieved Aug 2, 2018.
  7. IBM Corporation. "Enterprise COBOL for z/OS, V4.2, Language Reference". IBM Knowledge Center. Retrieved May 1, 2020.
  8. IBM Corporation. "Library Functions". IBM Knowledge Center. Retrieved May 1, 2020.
  9. IBM Corporation (1989). System/370 Extended Architecture Reference Summary. p. 41.
  10. "EBCDIC to ASCII Conversion of Signed Fields". DISC Media Conversion Specialists. Retrieved Nov 29, 2018.
  11. "GnuCOBOL Programmer's Guide". SourceForge. Retrieved Jan 12, 2022.
  12. "Micro Focus Visual COBOL 5.0 for Visual Studio 2019". Micro Focus. Retrieved Jan 12, 2022.
  13. Kednos Corporation. "Kednos PL/I for OpenVMS Systems Reference Manual". Kednos.com. Retrieved Jan 12, 2022.