Book Home Java Enterprise in a Nutshell Search this book

Chapter 2. Java Syntax from the Ground Up


The Unicode Character Set
Identifiers and Reserved Words
Primitive Data Types
Expressions and Operators
Classes and Objects
Array Types
Reference Types
Packages and the Java Namespace
Java File Structure
Defining and Running Java Programs
Differences Between C and Java

This chapter is a terse but comprehensive introduction to Java syntax. It is written primarily for readers who are new to the language, but have at least some previous programming experience. Determined novices with no prior programming experience may also find it useful. If you already know Java, you should find it a useful language reference. In previous editions of this book, this chapter was written explicitly for C and C++ programmers making the transition to Java. It has been rewritten for this edition to make it more generally useful, but it still contains comparisons to C and C++ for the benefit of programmers coming from those languages.[1]

[1] Readers who want even more thorough coverage of the Java language should consider The Java Programming Language, Second Edition, by Ken Arnold and James Gosling (the creator of Java) (Addison Wesley Longman). And hard-core readers may want to go straight to the primary source: The Java Language Specification, by James Gosling, Bill Joy, and Guy Steele (Addison Wesley Longman). This specification is available in printed book form, but is also freely available for download from Sun's web site at I found both documents quite helpful while writing this chapter.

This chapter documents the syntax of Java programs by starting at the very lowest level of Java syntax and building from there, covering increasingly higher orders of structure. It covers:

The syntax of most programming languages is complex, and Java is no exception. In general, it is not possible to document all elements of a language without referring to other elements that have not yet been discussed. For example, it is not really possible to explain in a meaningful way the operators and statements supported by Java without referring to objects. But it is also not possible to document objects thoroughly without referring to the operators and statements of the language. The process of learning Java, or any language, is therefore an iterative one. If you are new to Java (or a Java-style programming language), you may find that you benefit greatly from working through this chapter and the next twice, so that you can grasp the interrelated concepts.

2.1. The Unicode Character Set

Java programs are written using the Unicode character set. Unlike the 7-bit ASCII encoding, which is useful only for English, and the 8-bit ISO Latin-1 encoding, which is useful only for major Western European languages, the 16-bit Unicode encoding can represent virtually every written language in common use on the planet. Very few text editors support Unicode, however, and in practice, most Java programs are written in plain ASCII. 16-bit Unicode characters are typically written to files using an encoding known as UTF-8, which converts the 16-bit characters into a stream of bytes. The format is designed so that plain ASCII and Latin-1 text are valid UTF-8 byte streams. Thus, you can simply write plain ASCII programs, and they will work as valid Unicode.

If you want to embed a Unicode character within a Java program that is written in plain ASCII, use the special Unicode escape sequence \uxxxx. That is, a backslash and a lowercase u, followed by four hexadecimal characters. For example, \u0020 is the space character, and \u3c00 is the character π. You can use Unicode characters anywhere in a Java program, including comments and variable names.

Library Navigation Links

Copyright © 2001 O'Reilly & Associates. All rights reserved.

This HTML Help has been published using the chm2web software.