Menu English Ukrainian russian Home

Free technical library for hobbyists and professionals Free technical library


Lecture notes, cheat sheets
Free library / Directory / Lecture notes, cheat sheets

Informatics and information technologies. Lecture notes: briefly, the most important

Lecture notes, cheat sheets

Directory / Lecture notes, cheat sheets

Comments on the article Comments on the article

Table of contents

  1. Introduction to Computer Science (Informatics. Information. Presentation and processing of information. Number systems. Representation of numbers in a computer. Formalized concept of an algorithm)
  2. Pascal language (Introduction to Pascal. Standard procedures and functions. Pascal operators)
  3. Procedures and functions (The concept of an auxiliary algorithm. Procedures in Pascal. Functions in Pascal. Anticipatory descriptions and connection of subroutines. Directive)
  4. Subroutines (Routine parameters. Subroutine parameter types. String type in Pascal. Procedures and functions for string type variables. Records. Sets)
  5. Files (Files. File operations. Modules. Types of modules)
  6. Dynamic memory (Reference data type. Dynamic memory. Dynamic variables. Working with dynamic memory. Untyped pointers)
  7. Abstract data structures (Abstract data structures. Stacks. Queues)
  8. Tree Data Structures (Tree data structures. Operations on trees. Examples of implementation of operations)
  9. Counts (The concept of a graph. Methods of representing a graph. Representation of a graph by a list of incidences. Depth-first traversal algorithm for a graph. Representation of a graph as a list of lists. Breadth-first traversal algorithm for a graph)
  10. Object data type (Object type in Pascal. The concept of an object, its description and use. Inheritance. Creating instances of objects. Components and scope)
  11. Methods (Methods. Constructors and destructors. Destructors. Virtual methods. Object data fields and formal method parameters)
  12. Object Type Compatibility (Encapsulation. Extensible objects. Object type compatibility)
  13. Assembler (About assembler. Microprocessor software model. User registers. General purpose registers. Segment registers. Status and control registers)
  14. Registers (Microprocessor system registers. Control registers. System address registers. Debug registers)
  15. Assembly programs (Assembler program structure. Assembler syntax. Comparison operators. Operators and their precedence. Simplified segment definition directives. Identifiers created by the MODEL directive. Memory models. Memory model modifiers)
  16. Assembly Instruction Structures (Structure of a machine instruction. Methods for specifying instruction operands. Addressing methods)
  17. commands (Data transfer commands. Arithmetic commands)
  18. Control Transfer Commands (Logical commands. Truth table for logical negation. Truth table for logical inclusive OR. Truth table for logical AND. Truth table for logical exclusive OR. Meaning of abbreviations in the jcc command name. List of conditional jump commands for the command. Conditional jump commands and flags)

LECTURE No. 1. Introduction to computer science

1. Computer science. Information. Representation and processing of information

Informatics is engaged in a formalized representation of objects and structures of their relationships in various fields of science, technology, and production. Various formal tools are used to model objects and phenomena, such as logical formulas, data structures, programming languages, etc.

In computer science, such a fundamental concept as information has various meanings:

1) formal presentation of external forms of information;

2) abstract meaning of information, its internal content, semantics;

3) relation of information to the real world.

But, as a rule, information is understood as its abstract meaning - semantics. By interpreting the representation of information, we get its meaning, semantics. Therefore, if we want to exchange information, we need consistent views so that the correctness of interpretation is not violated. To do this, the interpretation of the representation of information is identified with some mathematical structures. In this case, information processing can be performed by rigorous mathematical methods.

One of the mathematical descriptions of information is its representation in the form of a function y =f(x, t), where t is time, x is a point in a certain field at which the value of y is measured. Depending on the parameters of the chi function (information can be classified.

If the parameters are scalar quantities that take on a continuous series of values, then the information obtained in this way is called continuous (or analog). If the parameters are given a certain change step, then the information is called discrete. Discrete information is considered universal, since for each specific parameter it is possible to obtain a function value with a given degree of accuracy.

Discrete information is usually identified with digital information, which is a special case of symbolic information of alphabetic representation. An alphabet is a finite set of symbols of any nature. Very often in computer science a situation arises when the characters of one alphabet must be represented by the characters of another, i.e., an encoding operation must be performed. If the number of characters of the encoding alphabet is less than the number of characters of the encoding alphabet, then the encoding operation itself is not complicated, otherwise it is necessary to use a fixed set of characters of the encoding alphabet for unambiguous correct encoding.

As practice has shown, the simplest alphabet that allows you to encode other alphabets is binary, consisting of two characters, which are usually denoted by 0 and 1. Using n characters of the binary alphabet, you can encode 2n characters, and this is enough to encode any alphabet.

The value that can be represented by a symbol of the binary alphabet is called the minimum unit of information or bit. Sequence of 8 bits - bytes. An alphabet containing 256 different 8-bit sequences is called a byte alphabet.

As a standard today in computer science, a code is adopted in which each character is encoded by 1 byte. There are other alphabets as well.

2. Number systems

A number system is a set of rules for naming and writing numbers. There are positional and non-positional number systems.

The number system is called positional if the value of the digit of the number depends on the location of the digit in the number. Otherwise, it is called nonpositional. The value of a number is determined by the position of these digits in the number.

3. Representation of numbers in a computer

32-bit processors can work with up to 232-1 RAM, and addresses can be written in the range 00000000 - FFFFFFFF. However, in real mode, the processor operates with memory up to 220-1, and addresses fall in the range 00000 - FFFFF. Bytes of memory can be combined into fields of both fixed and variable length. A word is a fixed length field consisting of 2 bytes, a double word is a field of 4 bytes. Field addresses can be even or odd, with even addresses performing operations faster.

Fixed-point numbers are represented in computers as integer binary numbers, and their size can be 1, 2, or 4 bytes.

Binary integers are represented in two's complement, and fixed-point numbers are represented in two's complement. Moreover, if a number occupies 2 bytes, then the structure of the number is written according to the following rule: the most significant digit is allocated to the sign of the number, and the rest - to the binary digits of the number. The complementary code of a positive number is equal to the number itself, and the complementary code of a negative number can be obtained using the following formula: x = 10i - \x\, where n is the digit capacity of the number.

In the binary number system, an additional code is obtained by inverting bits, i.e., replacing units with zeros and vice versa, and adding one to the least significant bit.

The number of bits of the mantissa determines the precision of the representation of numbers, the number of machine order bits determines the range of representation of floating point numbers.

4. Formalized concept of an algorithm

An algorithm can only exist if, at the same time, some mathematical object exists. The formalized concept of an algorithm is connected with the concept of recursive functions, normal Markov algorithms, Turing machines.

In mathematics, a function is called single-valued if, for any set of arguments, there is a law by which the unique value of the function is determined. An algorithm can act as such a law; in this case the function is said to be computable.

Recursive functions are a subclass of computable functions, and the algorithms that define the computation are called companion recursive function algorithms. First, the basic recursive functions are fixed, for which the accompanying algorithm is trivial, unambiguous; then three rules are introduced - substitution, recursion and minimization operators, with the help of which more complex recursive functions are obtained on the basis of basic functions.

The basic functions and their accompanying algorithms can be:

1) a function of n independent variables, identically equal to zero. Then, if the sign of the function is φn, then regardless of the number of arguments, the value of the function should be set equal to zero;

2) the identity function of n independent variables of the form ψni. Then, if the sign of the function is ψni, then the value of the function should be taken as the value of the i-th argument, counting from left to right;

3) Λ is a function of one independent argument. Then, if the sign of the function is λ, then the value of the function should be taken as the value following the value of the argument. Different scholars have proposed their own approaches to the formalized

representation of the algorithm. For example, the American scientist Church suggested that the class of computable functions is exhausted by recursive functions and, as a result, whatever the algorithm that processes one set of non-negative integers into another, there is an algorithm accompanying the recursive function that is equivalent to the given one. Therefore, if it is impossible to construct a recursive function to solve a given problem, then there is no algorithm for solving it. Another scientist, Turing, developed a virtual computer that processed an input sequence of characters into an output. In this regard, he put forward the thesis that any computable function is Turing computable.

LECTURE No. 2. Pascal language

1. Introduction to the Pascal language

The basic symbols of the language - letters, numbers and special characters - make up its alphabet. The Pascal language includes the following set of basic symbols:

1) 26 Latin lowercase and 26 Latin uppercase letters:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

abcdefghijklmnopqrstuvwxyz;

2) _ (underscore);

3) 10 digits: 0123456789;

4) signs of operations:

+ - x / = <> = := @;

5) limiters:

., ' ( ) [ ] (..) { } (* *).. : ;

6) specifiers: ^ # $;

7) service (reserved) words:

ABSOLUTE, ASSEMBLER, AND, ARRAY, ASM, BEGIN, CASE, CONST, CONSTRUCTOR, DESTRUCTOR, DIV, DO, DOWNTO, ELSE, END, EXPORT, EXTERNAL, FAR, FILE, FOR, FORWARD, FUNCTION, GOTO, IF, IMPLEMENTATION, IN, INDEX, INHERITED, INLINE, INTERFACE, INTERRUPT, LABEL, LIBRARY, MOD, NAME, NIL, NEAR, NOT, OBJECT, OF, OR, PACKED, PRIVATE, PROCEDURE, PROGRAM, PUBLIC, RECORD, REPEAT, RESIDENT, SET, SHL, SHR, STRING, THEN, TO, TYPE, UNIT, UNTIL, USES, VAR, VIRTUAL, WHILE, WITH, XOR.

In addition to those listed, the set of basic characters includes a space. Spaces cannot be used inside double characters and reserved words.

Type concept for data

It is customary in mathematics to classify variables according to some important characteristics. A strict distinction is made between real, complex and logical variables, between variables representing individual values ​​and a set of values, etc. When processing data on a computer, such a classification is even more important. In any algorithmic language, every constant, variable, expression, or function is of a particular type.

There is a rule in Pascal: the type is explicitly specified in the declaration of a variable or function that precedes its use. The Pascal type concept has the following main properties:

1) any data type defines a set of values ​​to which a constant belongs, which a variable or expression can take, or an operation or function can produce;

2) the type of value given by a constant, variable or expression can be determined by their form or description;

3) each operation or function requires fixed type arguments and produces a fixed type result.

It follows that the compiler can use type information to check the computability and correctness of various constructs.

The type defines:

1) possible values ​​of variables, constants, functions, expressions belonging to a given type;

2) the internal form of data presentation in a computer;

3) operations and functions that can be performed on values ​​belonging to a given type.

It should be noted that the mandatory description of the type leads to redundancy in the text of programs, but such redundancy is an important auxiliary tool for developing programs and is considered as a necessary property of modern high-level algorithmic languages.

There are scalar and structured data types in Pascal. Scalar types include standard types and user-defined types. Standard types include integer, real, character, boolean, and address types.

Integer types define constants, variables, and functions whose values ​​are realized by the set of integers allowed in a given computer.

Real types defines those data that are implemented by a subset of real numbers that are allowed in a given computer.

The user-defined types are enum and range. Structured types come in four flavors: arrays, sets, records, and files.

In addition to those listed, Pascal includes two more types - procedural and object.

A language expression consists of constants, variables, function pointers, operator signs, and brackets. An expression defines a rule for calculating some value. The order of calculation is determined by the precedence (priority) of the operations contained in it. Pascal has the following operator precedence:

1) calculations in parentheses;

2) calculation of function values;

3) unary operations;

4) operations *, /, div, mod, and;

5) operations +, -, or, xor;

6) relational operations =, <>, <, >, <=, >=.

Expressions are part of many Pascal language operators and can also be arguments to built-in functions.

2. Standard procedures and functions

Arithmetic functions

1.Function Abs(X);

Returns the absolute value of the parameter.

X is an expression of real or integer type.

2. Function ArcTan(X: Extended): Extended;

Returns the arc tangent of the argument.

X is an expression of real or integer type.

3. Function exp(X: Real): Real;

Returns the exponent.

X is an expression of real or integer type.

4.Frac(X: Real): Real;

Returns the fractional part of the argument.

X is a real type expression. The result is the fractional part of X, i.e.

Frac(X) = X-Int(X).

5. Function Int(X: Real): Real;

Returns the integer part of the argument.

X is a real type expression. The result is the integer part of X, i.e. X rounded towards zero.

6. Function Ln(X: Real): Real;

Returns the natural logarithm (Ln e = 1) of a real-type expression X.

7.Function Pi: Extended;

Returns the Pi value, which is defined as 3.1415926535.

8.Function Sin(X: Extended): Extended;

Returns the sine of the argument.

X is a real type expression. Sin returns the sine of angle X in radians.

9.Function Sqr(X: Extended): Extended;

Returns the square of the argument.

X is a floating point expression. The result is of the same type as X.

10.Function Sqrt(X: Extended): Extended;

Returns the square root of the argument.

X is a floating point expression. The result is the square root of X.

Value Conversion Procedures and Functions

1. Procedure Str(X [: Width [: Decimals]]; var S);

Converts the number X to a string representation according to

Width and Decimals formatting options. X is an expression of a real or integer type. Width and Decimals are integer type expressions. S is a variable of type String or a null-terminated character array if the extended syntax is allowed.

2. Function Chr(X: Byte): Char;

Returns the character with ordinal X in the ASCII table.

3.Function High(X);

Returns the largest value in the range of the parameter.

4.FunctionLow(X);

Returns the smallest value in the parameter range.

5 FunctionOrd(X): Longint;

Returns the ordinal value of an enumerated type expression. X is an enumerated type expression.

6. Function Round(X: Extended): Longint;

Rounds a real value to an integer. X is a real type expression. Round returns a Longint value, which is the value of X rounded to the nearest whole number. If X is exactly halfway between two integers, the number with the largest absolute value is returned. If the rounded value of X is outside the Longint range, a run-time error is generated that you can handle using the EInvalidOp exception.

7. Function Trunc(X: Extended): Longint;

Truncates a real type value to an integer. If the rounded value of X is outside the Longint range, a run-time error is generated that you can handle using the EInvalidOp exception.

8. Procedure Val(S; var V; var Code: Integer);

Converts a number from a string value S to a number

representation V. S - string type expression - a sequence of characters that forms an integer or real number. If the S expression is invalid, the index of the invalid character is stored in the Code variable. Otherwise Code is set to zero.

Ordinal Value Procedures and Functions

1. Procedure Dec(varX [; N: LongInt]);

Subtracts one or N from the variable X. Dec(X) corresponds to X:= X - 1, and Dec(X, N) corresponds to X:= X - N. X is a variable of an enumerated type, or of type PChar if the extended syntax is allowed, and N is an expression of integer type. The Dec procedure generates optimal code and is especially useful in long loops.

2. Procedure Inc(varX [; N: LongInt]);

Adds one or N to the variable X. X is a variable of enumerated type or PChar type if the extended syntax is allowed, and N is an expression of integral type. Inc (X) matches the instruction X:= X + 1, and Inc (X, N) matches the instruction X:= X + N. The Inc procedure generates optimal code and is especially useful in long loops.

3. FunctionOdd(X: LongInt): Boolean;

Returns True if X is an odd number, False otherwise.

4.Function Pred(X);

Returns the previous value of the parameter. X is an enumerated type expression. The result is of the same type.

5 Function Succ(X);

Returns the next parameter value. X is an enumerated type expression. The result is of the same type.

3. Pascal language operators

Conditional operator

The format of the full conditional statement is defined as follows: If B then SI else S2; where B is a branching condition (decision making), a logical expression or a relation; SI, S2 - one executable statement, simple or compound.

When executing a conditional statement, first the expression B is evaluated, then its result is analyzed: if B is true, then the statement S1 is executed - the branch of then, and the statement S2 is skipped; if B is false, then statement S2 - the else branch is executed, and statement S1 is skipped.

There is also an abbreviated form of the conditional operator. It is written as: If B then S.

Selection operator

The operator structure is as follows:

case S of

c1: instruction1;

c2: instruction2;

...

cn: instructionN;

else instruction

end;

where S is an ordinal type expression whose value is being calculated;

с1, с2..., сп - constants of ordinal type with which expressions are compared

S; instruction1,..., instructionN - operators of which the one whose constant matches the value of the expression S is executed;

instruction - a statement that is executed if the value of the Sylq expression matches none of the constants c1, c2.... cn.

This operator is a generalization of the conditional If operator for an arbitrary number of alternatives. There is an abbreviated form of the statement where there is no else branch.

Loop statement with parameter

Parameter loop statements that begin with the word for cause the statement, which can be a compound statement, to be repeatedly executed while the control variable is assigned an ascending sequence of values.

General view of the for operator:

for <loop counter> := <start value> to <end value> do <statement>;

When the for statement starts executing, the start and end values ​​are determined once, and these values ​​are retained throughout the execution of the for statement. The statement contained in the body of the for statement is executed once for each value in the range between the start and end values. The loop counter is always initialized to an initial value. When the for statement is running, the value of the loop counter is incremented with each iteration. If the start value is greater than the end value, then the statement contained in the body of the for statement is not executed. When the downto keyword is used in a loop statement, the value of the control variable is decremented by one on each iteration. If the start value in such a statement is less than the end value, then the statement contained in the body of the loop statement is not executed.

If the statement contained in the body of the for statement changes the value of the loop counter, then this is an error. After the execution of the for statement, the value of the control variable becomes undefined, unless the execution of the for statement was interrupted by a jump statement.

Loop statement with precondition

A precondition loop statement (beginning with the while keyword) contains an expression that controls the repeated execution of the statement (which can be a compound statement). Cycle shape:

While B do S;

where B is a logical condition, the truth of which is checked (it is a condition for terminating the loop);

S - loop body - one statement.

The expression that controls the repetition of a statement must be of type Boolean. It is evaluated before the inner statement is executed. The inner statement is executed repeatedly as long as the expression evaluates to True. If the expression evaluates to False from the start, then the statement contained within the precondition loop statement is not executed.

Loop statement with postcondition

In a loop statement with a postcondition (beginning with the word repeat), the expression that controls the repeated execution of a sequence of statements is contained within the repeat statement. Cycle shape:

repeat S until B;

where B is a logical condition, the truth of which is checked (it is a condition for terminating the loop);

S - one or more loop body statements.

The result of the expression must be of a boolean type. The statements enclosed between the repeat and until keywords are executed sequentially until the result of the expression evaluates to True. The statement sequence will be executed at least once because the expression is evaluated after each execution of the statement sequence.

LECTURE № 3. Procedures and functions

1. The concept of an auxiliary algorithm

An algorithm for solving a problem is designed by decomposing the entire problem into separate subtasks. Typically, subtasks are implemented as subroutines.

A subroutine is some auxiliary algorithm that is repeatedly used in the main algorithm with different values ​​of some incoming quantities, called parameters.

A subroutine in programming languages ​​is a sequence of statements that are defined and written in only one place in the program, but they can be called for execution from one or more points in the program. Each subroutine is identified by a unique name.

There are two types of subroutines in Pascal, procedures and functions. A procedure and a function are a named sequence of declarations and statements. When using procedures or functions, the program must contain the text of the procedure or function and a call to the procedure or function. The parameters specified in the description are called formal, those specified in the call to the subroutine are called actual. All formal parameters can be divided into the following categories:

1) parameters-variables;

2) constant parameters;

3) parameter-values;

4) procedure parameters and function parameters, i.e. procedural type parameters;

5) untyped variable parameters.

The texts of procedures and functions are placed in the section of descriptions of procedures and functions.

Passing procedure and function names as parameters

In many problems, especially in computational mathematics, it is necessary to pass the names of procedures and functions as parameters. To do this, TURBO PASCAL introduced a new data type - procedural or functional, depending on what is described. (Procedural and function types are described in the type declaration section.)

A function and procedural type is defined as the heading of a procedure and a function with a list of formal parameters but no name. It is possible to define a function or procedural type without parameters, for example:

type

Proc = procedure;

After declaring a procedural or functional type, it can be used to describe formal parameters - the names of procedures and functions. In addition, it is necessary to write those real procedures or functions whose names will be passed as actual parameters.

2. Procedures in Pascal

Each procedure description contains a header followed by a program block. The general form of the procedure header is as follows:

Procedure <name> [(<list of formal parameters>)];

A procedure is activated with a procedure statement that contains the name of the procedure and the required parameters. The statements to be executed when the procedure is run are contained in the statement part of the procedure module. If a statement contained in a procedure uses a procedure identifier inside a procedure module, then the procedure will be executed recursively, that is, it will refer to itself when executed.

3. Functions in Pascal

A function declaration defines the part of the program in which the value is calculated and returned. The general form of the function header is as follows:

Function <name> [(<list of formal parameters>)]: <return type>;

The function is activated when it is called. When a function is called, the function identifier and any parameters necessary for its evaluation are specified. A function call can be included in expressions as an operand. When the expression is evaluated, the function is executed and the value of the operand becomes the value returned by the function.

The operator part of the function block specifies the statements that must be executed when the function is activated. A module must contain at least one assignment statement that assigns a value to a function identifier. The result of the function is the last value assigned. If there is no such assignment statement, or if it has not been executed, then the return value of the function is undefined.

If a function identifier is used when calling a function within a module, then the function is executed recursively.

4. Forward descriptions and connection of subroutines. Directive

A program may contain several subroutines, i.e. the structure of the program may be complicated. However, these subroutines can be at the same nesting level, so the subroutine declaration must come first, and then the call to it, unless a special forward declaration is used.

A procedure declaration that contains a forward directive instead of a statement block is called a forward declaration. At some point after this declaration, a procedure must be defined by means of a defining declaration. A defining declaration is one that uses the same procedure identifier but omits the list of formal parameters and includes a statement block. The forward declaration and the defining declaration must appear in the same part of the procedure and function declarations. Between them, other procedures and functions can be declared that can refer to the forward-declaration procedure. Thus, mutual recursion is possible.

The forward description and the defining description are the complete description of the procedure. The procedure is considered to be described using forward description.

If the program contains quite a lot of subroutines, then the program will cease to be visual, it will be difficult to navigate in it. To avoid this, some subroutines are stored as source files on disk, and if necessary, they are connected to the main program at the compilation stage using a compilation directive.

A directive is a special comment that can be placed anywhere in a program, where a normal comment can be. However, they differ in that the directive has a special notation: immediately after the closing bracket without a space, the sign S is written, and then, again without a space, the directive is indicated.

Example

1) {SE+} - emulate math coprocessor;

2) {SF+} - form a distant type of procedure and function call;

3) {SN+} - use math coprocessor;

4) {SR+} - check if the ranges are out of bounds.

Some compilation switches may contain a parameter, for example:

{$1 file name} - include the named file in the text of the compiled program.

LECTURE No. 4. Subroutines

1. Subprogram parameters

The description of a procedure or function specifies a list of formal parameters. Each parameter declared in a formal parameter list is local to the described procedure or function, and can be referenced in the module associated with that procedure or function by its identifier.

There are three types of parameters: value, variable, and untyped variable. They are characterized as follows.

1. A group of parameters without a preceding keyword is a list of value parameters.

2. A group of parameters preceded by the const keyword and followed by a type is a list of constant parameters.

3. A group of parameters preceded by the var keyword and followed by a type is a list of untyped variable parameters.

4. A group of parameters preceded by the var or const keyword and not followed by a type is a list of untyped variable parameters.

2. Types of subroutine parameters

Value parameters

A formal value parameter is treated as a variable local to the procedure or function, except that it derives its initial value from the corresponding actual parameter when the procedure or function is invoked. Changes that a formal value parameter undergoes do not affect the value of the actual parameter. The corresponding actual value of the value parameter must be an expression, and its value must not be a file type or any structure type containing a file type.

The actual parameter must be of a type that is assignment compatible with the type of the formal value parameter. If the parameter is of type string, then the formal parameter will have a size attribute of 255.

Constant Parameters

Formal constant parameters work similarly to a read-only local variable that gets its value when a procedure or function is invoked from the corresponding actual parameter. Assignments to a formal constant parameter are not allowed. A formal constant parameter also cannot be passed as an actual parameter to another procedure or function. A constant parameter corresponding to an actual parameter in a procedure or function statement must follow the same rules as the actual parameter value.

In cases where a formal parameter does not change its value during the execution of a procedure or function, a constant parameter should be used instead of a value parameter. Constant parameters allow the implementation of a procedure or function to protect against accidental assignments to a formal parameter. In addition, for struct and string type parameters, the compiler can generate more efficient code when used instead of value-parameters for constant-parameters.

Variable parameters

A variable parameter is used when a value must be passed from a procedure or function to the calling program. The corresponding actual parameter in a procedure or function call statement must be a variable reference. When a procedure or function is invoked, the formal parameter-variable is replaced by the actual variable, any changes in the value of the formal parameter-variable are reflected in the actual parameter.

Within a procedure or function, any reference to a formal variable parameter results in access to the actual parameter itself. The type of the actual parameter must match the type of the formal variable parameter, but this restriction can be circumvented by using an untyped variable parameter).

Untyped Parameters

When the formal parameter is an untyped variable parameter, then the corresponding actual parameter can be any reference to a variable or constant, regardless of its type. An untyped parameter declared with the var keyword can be modified, while an untyped parameter declared with the const keyword is read-only.

In a procedure or function, an untyped variable parameter has no type, i.e., it is incompatible with variables of all types until it is given a specific type by variable type assignment.

Although untyped parameters provide more flexibility, there are some risks associated with using them. The compiler cannot check the validity of operations on untyped variables.

Procedural Variables

After defining a procedural type, it becomes possible to describe variables of this type. Such variables are called procedural variables. Like an integer variable that can be assigned a value of an integer type, a procedural variable can be assigned a value of a procedural type. Such a value could, of course, be another procedure variable, but it could also be a procedure or function identifier. In this context, the declaration of a procedure or function can be viewed as a description of a special kind of constant whose value is the procedure or function.

As with any other assignment, the values ​​of the variable on the left side and on the right side must be assignment compatible. Procedural types, to be assignment compatible, must have the same number of parameters, and the parameters in the corresponding positions must be of the same type. Parameter names in a procedural type declaration have no effect.

In addition, to ensure assignment compatibility, a procedure or function, if it is to be assigned to a procedure variable, must satisfy the following requirements:

1) it should not be a standard procedure or function;

2) such a procedure or function cannot be nested;

3) such a procedure must not be an inline procedure;

4) it must not be an interrupt procedure.

Standard procedures and functions are the procedures and functions described in the System module, such as Writeln, Readln, Chr, Ord. Nested procedures and functions with procedural variables cannot be used. A procedure or function is considered nested when it is declared within another procedure or function.

The use of procedural types is not limited to just procedural variables. Like any other type, a procedural type can participate in the declaration of a structural type.

When a procedure variable is assigned the value of a procedure, what happens at the physical layer is that the address of the procedure is stored in the variable. In fact, a procedure variable is very similar to a pointer variable, only instead of referring to data, it points to a procedure or function. Like a pointer, a procedural variable occupies 4 bytes (two words) that contains a memory address. The first word stores the offset, the second word stores the segment.

Procedural Type Parameters

Since procedural types can be used in any context, it is possible to describe procedures or functions that take procedures and functions as parameters. Procedural type parameters are especially useful when you need to perform some common action on multiple procedures or functions.

If a procedure or function is to be passed as a parameter, it must follow the same type compatibility rules as the assignment. That is, such procedures or functions must be compiled with the far directive, they cannot be built-in functions, they cannot be nested, and they cannot be described with the inline or interrupt attributes.

LECTURE #5. String data type

1. String type in Pascal

A sequence of characters of a certain length is called a string. Variables of string type are defined by specifying the name of the variable, the reserved word string, and optionally, but not necessarily, specifying the maximum size, i.e., the length of the string, in square brackets. If you do not set the maximum string size, then by default it will be 255, i.e. the string will consist of 255 characters.

Each element of a string can be referred to by its number. However, strings are input and output as a whole, not element by element, as is the case with arrays. The number of characters entered must not exceed that specified in the maximum string size, so if such an excess occurs, then the "extra" characters will be ignored.

2. Procedures and functions for string type variables

1. Function Copy(S: String; Index, Count: Integer): String;

Returns a substring of a string. S is an expression of type String.

Index and Count are integer type expressions. The function returns a string containing Count characters starting at the Index position. If Index is greater than the length of S, the function returns an empty string.

2. Procedure Delete(var S: String; Index, Count: Integer);

Removes a substring of characters of length Count from string S, starting at position Index. S is a variable of type String. Index and Count are integer type expressions. If Index is greater than the length of S, no characters are removed.

3. Procedure Insert(Source: String; var S: String; Index: Integer);

Concatenates a substring into a string, starting at a specified position. Source is an expression of type String. S is a variable of type String of any length. Index is an expression of integer type. Insert inserts Source into S, starting at position S[Index].

4. Function Length(S: String): Integer;

Returns the number of characters actually used in string S. Note that when using null-terminated strings, the number of characters is not necessarily equal to the number of bytes.

5. Function Pos(Substr: String; S: String): Integer;

Searches for a substring in a string. Pos looks for Substr within S and returns an integer value that is the index of the first character of Substr within S. If Substr is not found, Pos returns null.

3. Recordings

A record is a collection of a limited number of logically related components belonging to different types. The components of a record are called fields, each of which is identified by a name. A record field contains the name of the field, followed by a colon to indicate the type of the field. Record fields can be of any type allowed in Pascal, with the exception of the file type.

The description of a record in the Pascal language is carried out using the service word RECORD, followed by the description of the components of the record. The description of the entry ends with the service word END.

For example, a notebook contains last names, initials, and phone numbers, so it is convenient to represent a separate line in a notebook as the following entry:

type Row = Record

FIO: String[20];

TEL: String[7];

end;

var str: Row;

Record descriptions are also possible without using the type name, for example:

var str : Record

FIO : String[20];

TEL : String[7];

end;

Referencing a record as a whole is allowed only in assignment statements where record names of the same type are used to the left and right of the assignment sign. In all other cases, separate fields of records are operated. To refer to an individual record component, you must specify the record name and, through a dot, specify the name of the desired field. Such a name is called a compound name. A record component can also be a record, in which case the distinguished name will contain not two, but more names.

Referencing record components can be simplified by using the with append operator. It allows you to replace the compound names that characterize each field with just field names, and define the record name in the join statement.

Sometimes the content of an individual record depends on the value of one of its fields. In the Pascal language, a record description is allowed, consisting of a common and variant parts. The variant part is specified using the case P of construct, where P is the name of the field from the common part of the record. The possible values ​​accepted by this field are listed in the same way as in the variant statement. However, instead of specifying the action to perform, as is done in a variant statement, the variant fields are specified in parentheses. The description of the variant part ends with the service word end. The field type P can be specified in the heading of the variant part. Records are initialized using typed constants.

4. Sets

The concept of a set in the Pascal language is based on the mathematical concept of sets: it is a limited collection of different elements. An enumerated or interval data type is used to construct a concrete set type. The type of elements that make up a set is called the base type.

A multiple type is described using the Set of function words, for example:

type M = Set of B;

Here M is the plural type, B is the base type.

The belonging of variables to a plural type can be determined directly in the variable declaration section.

Set type constants are written as a bracketed sequence of base type elements or intervals, separated by commas. A constant of the form [] means an empty subset.

A set includes a set of elements of the base type, all subsets of the given set, and the empty subset. If the base type on which the set is built has K elements, then the number of subsets included in this set is equal to 2 to the power of K. The order of listing the elements of the base type in constants is indifferent. The value of a variable of a multiple type can be given by a construction of the form [T], where T is a variable of the base type.

Assignment (:=), union (+), intersection (*), and subtraction (-) operations are applicable to variables and constants of a set type. The result of these operations is a value of the plural type:

1) ['A','B'] + ['A','D'] will give ['A','B','D'];

2) ['A'] * ['A','B','C'] will give ['A'];

3) ['A','B','C'] - ['A','B'] will give ['C'].

The following operations are applicable to multiple values: identity (=), non-identity (<>), contained in (<=), contains (>=). The result of these operations has a boolean type:

1) ['A','B'] = ['A','C'] will give FALSE ;

2) ['A','B'] <> ['A','C'] will give TRUE;

3) ['B'] <= ['B','C'] will give TRUE;

4) ['C','D'] >= ['A'] will give FALSE.

In addition to these operations, to work with values ​​of a set type, the in operation is used, which checks whether the element of the base type to the left of the operation sign belongs to the set to the right of the operation sign. The result of this operation is a boolean. The operation of checking whether an element belongs to a set is often used instead of relational operations.

When multiple data types are used in programs, operations are performed on bit strings of data. Each value of the multiple type in the computer memory corresponds to one binary digit.

Values ​​of a multiple type cannot be elements of an I/O list. In each concrete implementation of the compiler from the Pascal language, the number of elements of the base type on which the set is built is limited.

The initialization of multiple type values ​​is done using typed constants.

Here are some procedures for working with sets.

1. Procedure Exclude(var S: Set of T; I:T);

Removes element I from set S. S is a variable of type "set" and I is an expression of a type compatible with the original type of S. Exclude(S, I) is the same as S : = S - [I], but generates more efficient code.

2. Procedure Include(var S: Set of T; I:T);

Adds an element I to the set S. S is a variable of type "set" and I is an expression of a type compatible with the type S. The Include(S, I) construct is the same as S : = S + [I], but generates more efficient code.

LECTURE No. 6. Files

1. Files. File operations

The introduction of the file type into the Pascal language is caused by the need to provide the ability to work with peripheral (external) computer devices designed for input, output and data storage.

The file data type (or file) defines an ordered collection of an arbitrary number of components of the same type. The common property of an array, set, and record is that the number of their components is determined at the stage of writing the program, while the number of file components in the program text is not determined and can be arbitrary.

When working with files, I / O operations are performed. An input operation means transferring data from an external device (from an input file) to the main memory of a computer, an output operation is a transfer of data from the main memory to an external device (to an output file). Files on external devices are often referred to as physical files. Their names are determined by the operating system.

In Pascal programs, filenames are specified using strings. To work with files in the program, you must define a file variable. Pascal supports three file types: text files, component files, untyped files.

File variables that are declared in a program are called logical files. All the basic procedures and functions that provide data I/O work only with logical files. The physical file must be associated with the logical file before the file opening procedures can be performed.

Text files

A special place in the Pascal language is occupied by text files, the components of which are of character type. To describe text files, the language defines the standard type Text:

var TF1, TF2: Text;

Text files are a sequence of lines, and lines are a sequence of characters. Lines are of variable length, each line ends with a line terminator.

Component Files

A component or typed file is a file with the declared type of its components. Component files consist of machine representations of variable values; they store data in the same form as computer memory.

The description of the file type values ​​is:

type M = File Of T;

where M is the name of the file type;

T - component type.

File components can be all scalar types, and from structured types - arrays, sets, records. In almost all specific implementations of the Pascal language, the "file of files" construct is not allowed.

All operations on component files are performed using standard procedures.

Write(f,X1,X2,...XK)

Untyped files

Untyped files allow you to write arbitrary sections of computer memory to disk and read them from disk to memory. Untyped files are described as follows:

var f: File;

Now we list the procedures and functions for working with different types of files.

1. Procedure Assign(var F; FileName: String);

The AssignFile procedure maps an external file name to a file variable.

F is a file variable of any file type, FileName is a String expression, or a PChar expression if extended syntax is allowed. All further operations with F are performed with an external file.

You cannot use a procedure with an already open file variable.

2. Procedure Close(varF);

The procedure breaks the link between the file variable and the external disk file and closes the file.

F is a file variable of any file type, opened by the Reset, Rewrite, or Append procedures. The external file associated with F is fully modified and then closed, freeing the file descriptor for reuse.

The {SI+} directive allows you to handle errors during program execution using exception handling. With the {$1-} directive turned off, you must use IOResult to check for I/O errors.

3.Function Eof(var F): Boolean;

{Typed or untyped files}

Function Eof[(var F: Text)]: Boolean;

{text files}

Checks whether or not the current file position is the end of the file.

Eof(F) returns True if the current file position is after the last character of the file, or if the file is empty; otherwise Eof(F) returns False.

The {SI+} directive allows you to handle errors during program execution using exception handling. With the {SI-} directive turned off, you must use IOResult to check for I/O errors.

4. Procedure Erase(var F);

Deletes the external file associated with F.

F is a file variable of any file type.

Before calling the Erase procedure, the file must be closed.

The {SI+} directive allows you to handle errors during program execution using exception handling. With the {SI-} directive turned off, you must use IOResult to check for I/O errors.

5. Function FileSize(var F): Integer;

Returns the size in bytes of file F However, if F is a typed file, FileSize will return the number of records in the file. The file must be open before using the FileSize function. If the file is empty, FileSize(F) returns zero. F is a variable of any file type.

6.Function FilePos(var F): LongInt;

Returns the current position of a file within a file.

Before using the FilePos function, the file must be open. The FilePos function is not used with text files. F is a variable of any file type, except for the Text type.

7. Procedure Reset(var F [: File; RecSize: Word]);

Opens an existing file.

F is a variable of any file type associated with an external file using AssignFile. RecSize is an optional expression that is used if F is an untyped file. If F is an untyped file, RecSize determines the record size that is used when transferring data. If RecSize is omitted, the default record size is 128 bytes.

The Reset procedure opens an existing external file associated with file variable F. If there is no external file with that name, a run-time error occurs. If the file associated with F is already open, it is first closed and then reopened. The current file position is set to the beginning of the file.

8. Procedure Rewrite(var F: File [; Recsize: Word]);

Creates and opens a new file.

F is a variable of any file type associated with an external file using AssignFile. RecSize is an optional expression that is used if F is an untyped file. If F is an untyped file, RecSize determines the record size that is used when transferring data. If RecSize is omitted, the default record size is 128 bytes.

The Rewrite procedure creates a new external file with the name associated with F. If an external file with the same name already exists, it is deleted and a new empty file is created.

9. Procedure Seek(var F; N: LongInt);

Moves the current file position to the specified component. You can only use the procedure with open typed or untyped files.

The current position of file F is moved to number N. The number of the first component of the file is 0.

The Seek(F, FileSize(F)) instruction moves the current file position to the end of the file.

10. Procedure Append(var F: Text);

Opens an existing text file to append information to the end of the file (append).

If an external file with the given name does not exist, a run-time error occurs. If file F is already open, it closes and reopens. The current file position is set to the end of the file.

11.Function Eoln[(var F: Text)]: Boolean;

Checks if the current file position is the end of a line in a text file.

Eoln(F) returns True if the current file position is at the end of a line or file; otherwise Eoln(F) returns False.

12. Procedure Read(F, V1 [, V2,..., Vn]);

{Typed and untyped files}

Procedure Read([var F: Text;] V1 [, V2,..., Vn]);

{text files}

For typed files, the procedure reads the file component into a variable. On each read, the current position in the file advances to the next element.

For text files, one or more values ​​are read into one or more variables.

With String variables, Read reads all characters up to (but not including) the next end-of-line marker, or until Eof(F) evaluates to True. The resulting character string is assigned to the variable.

In the case of a variable of an integer or real type, the procedure waits for a sequence of characters that form a number according to the rules of Pascal syntax. Reading stops when the first space, tab, or end-of-line is encountered, or when Eof(F) evaluates to True. If the numeric string does not match the expected format, then an I/O error occurs.

13. Procedure Readln([var F: Text;] V1 [, V2..., Vn]);

It is an extension of the Read procedure and is defined for text files. Reads a string of characters in the file, including the end-of-line marker, and moves to the beginning of the next line. Calling the Readln(F) function with no parameters moves the current file position to the beginning of the next line, if there is one, otherwise it jumps to the end of the file.

14. Function SeekEof[(var F: Text)]: Boolean;

Returns the end-of-file and can only be used for open text files. Typically used to read numeric values ​​from text files.

15. Function SeekEoln[(var F: Text)]: Boolean;

Returns the line terminator in a file and can only be used for open text files. Typically used to read numeric values ​​from text files.

16. Procedure Write([var F: Text;] P1 [, P2,..., Pn]);

{text files}

Writes one or more values ​​to a text file.

Each entry parameter must be of type Char, one of the integer types (Byte, ShortInt, Word, Longint, Cardinal), one of the floating point types (Single, Real, Double, Extended, Currency), one of the string types (PChar, AisiString , ShortString), or one of the boolean types (Boolean, Bool).

Procedure Write(F, V1,..., Vn);

{Typed files}

Writes a variable to a file component. Variables VI...., Vn must be of the same type as the file elements. Each time a variable is written, the current position in the file is moved to the next element.

17. Procedure Writeln([var F: Text;] [P1, P2,..., Pn]);

{text files}

Performs a Write operation, then places an end-of-line marker in the file.

Calling Writeln(F) with no parameters writes an end-of-line marker to the file. The file must be open for output.

2. Modules. Types of modules

A module (1Ж1Т) in Pascal is a specially designed library of subroutines. A module, unlike a program, cannot be launched on its own, it can only participate in building programs and other modules. Modules allow you to create personal libraries of procedures and functions and build programs of almost any size.

A module in Pascal is a separately stored and independently compiled program unit. In general, a module is a collection of software resources intended for use by other programs. Program resources are understood as any elements of the Pascal language: constants, types, variables, subroutines. The module itself is not an executable program, its elements are used by other program units.

All program elements of the module can be divided into two parts:

1) program elements intended for use by other programs or modules, such elements are called visible outside the module;

2) software elements that are necessary only for the operation of the module itself, they are called invisible (or hidden).

In accordance with this, the module, in addition to the header, contains three main parts, called interface, executable and initialized.

In general, a module has the following structure:

unit <module name>; {module title}

interface

{description of visible program elements of the module}

implementation

{description of the hidden programming elements of the module}

begin

{module element initialization statements}

end.

In a particular case, the module may not contain an implementation part and an initialization part, then the module structure will be as follows:

unit <module name>; {module title}

interface

{description of visible program elements of the module}

implementation

end.

The use of procedures and functions in modules has its own peculiarities. The subroutine header contains all the information necessary to call it: name, list and type of parameters, result type for functions. This information must be available to other programs and modules. On the other hand, the text of a subroutine that implements its algorithm cannot be used by other programs and modules. Therefore, the headings of procedures and functions are placed in the interface part of the module, and the text is placed in the implementation part.

The interface part of the module contains only visible (accessible to other programs and modules) headers of procedures and functions (without the service word forward). The full text of the procedure or function is placed in the implementation part, and the header may not contain a list of formal parameters.

The source code of the module must be compiled using the Make directive of the Compile submenu and written to disk. The result of module compilation is a file with extension . TPU (Turbo Pascal Unit). The base name of the module is taken from the header of the module.

To connect a module to the program, you must specify its name in the module description section, for example:

uses Crt, Graph;

In the event that the names of variables in the interface part of the module and in the program using this module are the same, the reference will be to the variable described in the program. To refer to a variable declared in a module, you must use a compound name consisting of the module name and the variable name, separated by a dot. The use of compound names applies not only to variable names, but to all names declared in the interface part of the module.

Recursive use of modules is prohibited.

If a module has an initialization section, then the statements in that section will be executed before the program that uses that module starts executing.

Let's list the types of modules.

1. SYSTEM module.

The SYSTEM module implements lower-level support routines for all built-in features such as I/O, string manipulation, floating point operations, and dynamic memory allocation.

The SYSTEM module contains all the standard and built-in Pascal routines and functions. Any Pascal subroutine that is not part of standard Pascal and is not found in any other module is contained in the System module. This unit is automatically used in all programs and does not need to be specified in the uses statement.

2. DOS module.

The Dos module implements numerous Pascal routines and functions that are equivalent to the most commonly used DOS calls, such as GetTime, SetTime, DiskSize, and so on.

3. CRT module.

The CRT module implements a number of powerful programs that provide full control over the PC's features such as screen mode control, extended keyboard codes, colors, windows, and sounds. The CRT module can only be used in programs that run on personal computers IBM PC, PC AT, PS / 2 from IBM and are fully compatible with them.

One of the main advantages of using the CRT module is greater speed and flexibility in screen operations. Programs that do not work with the CRT module display information on the screen using the DOS operating system, which is associated with additional overhead. When using the CRT module, output information is sent directly to the basic input/output system (BIOS) or, for even faster operations, directly to video memory.

4. GRAPH module.

Using the procedures and functions included in this module, you can create various graphics on the screen.

5. OVERLAY module.

The OVERLAY module allows you to reduce the memory requirements of a real mode DOS program. In fact, it is possible to write programs that exceed the total amount of available memory, since only part of the program will be in memory at any given moment.

LECTURE No. 7. Dynamic memory

1. Reference data type. dynamic memory. Dynamic variables

A static variable (statically allocated) is a variable declared explicitly in the program, it is referred to by name. The place in memory for placing static variables is determined when the program is compiled. Unlike such static variables, Pascal programs can create dynamic variables. The main property of dynamic variables is that they are created and memory is allocated for them during program execution.

Dynamic variables are placed in a dynamic memory area (heap-area). A dynamic variable is not specified explicitly in variable declarations and cannot be referred to by name. Such variables are accessed using pointers and references.

A reference type (pointer) defines a set of values ​​that point to dynamic variables of a specific type, called a base type. A reference type variable contains the address of a dynamic variable in memory. If the base type is an undeclared identifier, then it must be declared in the same part of the type declaration as the pointer type.

The reserved word nil denotes a constant with a pointer value that does not point to anything.

Let us give an example of the description of dynamic variables.

var p1, p2 : ^real;

p3, p4 : ^integer;

...

2. Working with dynamic memory. Untyped Pointers

Dynamic Memory Procedures and Functions

1. Procedure New(var p: Pointer).

Allocates space in the dynamic memory area to accommodate the dynamic variable pЛ, and assigns its address to the pointer p.

2. Procedure Dispose(varp: Pointer).

Frees the memory allocated for dynamic variable allocation by the New procedure, and the value of the pointer p becomes undefined.

3. Procedure GetMem(varp: Pointer; size: Word).

Allocates a memory section in the heap-area, assigns the address of its beginning to the p pointer, the size of the section in bytes is specified by the size parameter.

4. Procedure FreeMem(var p: Pointer; size: Word).

Frees the memory area, the beginning address of which is specified by the p pointer, and the size is specified by the size parameter. The value of the pointer p becomes undefined.

5. Procedure Mark(var p: Pointer)

Writes to the pointer p the address of the beginning of a section of free dynamic memory at the time of its call.

6. Procedure Release(var p: Pointer)

Releases a section of dynamic memory, starting from the address written to the pointer p by the Mark procedure, i.e. clears the dynamic memory that was occupied after the call to the Mark procedure.

7. MaxAvaikLongint Function

Returns the length, in bytes, of the longest free heap.

8. MemAvaikLongint Function

Returns the total amount of free dynamic memory in bytes.

9. Helper function SizeOf(X):Word

Returns the amount of bytes occupied by X, where X can be either a variable name of any type or a type name.

The built-in type Pointer denotes an untyped pointer, that is, a pointer that does not point to any particular type. Variables of type Pointer can be dereferenced: specifying the ^ character after such a variable causes an error.

Like the value denoted by nil, Pointer values ​​are compatible with all other pointer types.

LECTURE № 8. Abstract data structures

1. Abstract data structures

Structured data types, such as arrays, sets, and records, are static structures because their sizes do not change during the entire execution of the program.

It is often required that data structures change their sizes in the course of solving a problem. Such data structures are called dynamic. These include stacks, queues, lists, trees, etc.

Description of dynamic structures with the help of arrays, records and files leads to wasteful use of computer memory and increases the time for solving problems.

Each component of any dynamic structure is a record containing at least two fields: one field of type "pointer", and the second - for data placement. In general, a record may contain not one, but several pointers and several data fields. A data field can be a variable, an array, a set, or a record.

If the pointing part contains the address of one element of the list, then the list is called unidirectional (or singly linked). If it contains two components, then it is doubly connected. You can perform various operations on lists, for example:

1) adding an element to the list;

2) removing an element from the list with a given key;

3) search for an element with a given value of the key field;

4) sorting the elements of the list;

5) division of the list into two or more lists;

6) combining two or more lists into one;

7) other operations.

However, as a rule, the need for all operations in solving various problems does not arise. Therefore, depending on the basic operations that need to be applied, there are different types of lists. The most popular of these are stack and queue.

2. Stacks

A stack is a dynamic data structure, the addition of a component to which and the removal of a component from which are made from one end, called the top of the stack. The stack works on the principle of LIFO (Last-In, First-Out) - "Last in, first out".

There are usually three operations performed on stacks:

1) initial formation of the stack (record of the first component);

2) adding a component to the stack;

3) selection of the component (deletion).

To form a stack and work with it, you must have two variables of the "pointer" type, the first of which determines the top of the stack, and the second is auxiliary.

Example. Write a program that forms a stack, adds an arbitrary number of components to it, and then reads all the components and displays them on the display screen. Take a character string as data. Data input - from the keyboard, a sign of the end of input - a string of characters END.

Program STACK;

uses Crt;

type

Alpha = String[10];

PComp = ^Comp;

Comp = record

SD : Alfa

pNext : PComp

end;

var

pTop:PComp;

sc: Alfa;

Create ProcedureStack(var pTop : PComp; var sC : Alfa);

begin

New(pTop);

pTop^.pNext := NIL;

pTop^.sD := sC;

end;

Add ProcedureComp(var pTop : PComp; var sC : Alfa);

var pAux : PComp;

begin

NEW(pAux);

pAux^.pNext := pTop;

pTop := pAux;

pTop^.sD := sC;

end;

Procedure DelComp(var pTop : PComp; var sC : ALFA);

begin

sC := pTop^.sD;

pTop := pTop^.pNext;

end;

begin

Clrscr;

writeln(' ENTER STRING ');

readln(sC);

CreateStack(pTop, sc);

repeat

writeln(' ENTER STRING ');

readln(sC);

AddComp(pTop, sc);

until sC = 'END';

writeln('****** OUTPUT ******');

repeat

DelComp(pTop, sc);

writeln(sC);

until pTop = NIL;

end.

3. Queues

A queue is a dynamic data structure where a component is added at one end and retrieved at the other end. The queue works on the principle of FIFO (First-In, First-Out) - "First in, first served".

To form a queue and work with it, it is necessary to have three variables of the pointer type, the first of which determines the beginning of the queue, the second - the end of the queue, the third - auxiliary.

Example. Write a program that forms a queue, adds an arbitrary number of components to it, and then reads all the components and displays them on the display screen. Take a character string as data. Data input - from the keyboard, a sign of the end of input - a string of characters END.

ProgramQUEUE;

uses Crt;

type

Alpha = String[10];

PComp = ^Comp;

Comp = record

SD : Alfa

pNext : PComp;

end;

var

pBegin, pEnd : PComp;

sc: Alfa;

Create ProcedureQueue(var pBegin,pEnd:PComp; var sC:Alfa);

begin

New(pBegin);

pBegin^.pNext := NIL;

pBegin^.sD := sC;

pEnd := pBegin;

end;

Procedure Add ProcedureQueue(var pEnd : PComp; var sC : Alfa);

var pAux : PComp;

begin

New(pAux);

pAux^.pNext := NIL;

pEnd^.pNext := pAux;

pEnd := pAux;

pEnd^.sD := sC;

end;

Procedure DelQueue(var pBegin : PComp; var sC : Alfa);

begin

sC := pBegin^.sD;

pBegin := pBegin^.pNext;

end;

begin

Clrscr;

writeln(' ENTER STRING ');

readln(sC);

CreateQueue(pBegin, pEnd, sc);

repeat

writeln(' ENTER STRING ');

readln(sC);

AddQueue(pEnd, sc);

until sC = 'END';

writeln(' ***** DISPLAY RESULTS *****');

repeat

DelQueue(pBegin, sc);

writeln(sC);

until pBegin = NIL;

end.

LECTURE No. 9. Tree-like data structures

1. Tree Data Structures

A tree-like data structure is a finite set of elements-nodes between which there are relations - the connection between the source and the generated.

If we use the recursive definition proposed by N. Wirth, then a tree data structure with base type t is either an empty structure or a node of type t, with which a finite set of tree structures with base type t, called subtrees, is associated.

Next, we give the definitions used when operating with tree structures.

If node y is located directly below node x, then node y is called the immediate descendant of node x, and x is the immediate ancestor of node y, i.e., if node x is at the i-th level, then node y is accordingly at (i + 1) - th level.

The maximum level of a tree node is called the height or depth of the tree. An ancestor does not have only one node of the tree - its root.

Tree nodes that have no children are called leaf nodes (or leaves of the tree). All other nodes are called internal nodes. The number of immediate children of a node determines the degree of that node, and the maximum possible degree of a node in a given tree determines the degree of the tree.

Ancestors and descendants cannot be interchanged, i.e., the connection between the original and the generated acts only in one direction.

If you go from the root of the tree to some particular node, then the number of branches of the tree that will be traversed in this case is called the length of the path for this node. If all branches (nodes) of a tree are ordered, then the tree is said to be ordered.

Binary trees are a special case of tree structures. These are trees in which each child has at most two children, called the left and right subtrees. Thus, a binary tree is a tree structure whose degree is two.

The ordering of a binary tree is determined by the following rule: each node has its own key field, and for each node the key value is greater than all keys in its left subtree and less than all keys in its right subtree.

A tree whose degree is greater than two is called strongly branched.

2. Operations on trees

Further, we will consider all operations in relation to binary trees.

I. Tree construction

We present an algorithm for constructing an ordered tree.

1. If the tree is empty, then the data is transferred to the root of the tree. If the tree is not empty, then one of its branches is descended in such a way that the order of the tree is not violated. As a result, the new node becomes the next leaf of the tree.

2. To add a node to an already existing tree, you can use the above algorithm.

3. When deleting a node from the tree, you should be careful. If the node to be removed is a leaf, or has only one child, then the operation is simple. If the node to be deleted has two descendants, then it will be necessary to find a node among its descendants that can be put in its place. This is necessary due to the requirement that the tree be ordered.

You can do this: swap the node to be removed with the node with the largest key value in the left subtree, or with the node with the smallest key value in the right subtree, and then delete the desired node as a leaf.

II. Finding a node with a given key field value

When performing this operation, it is necessary to traverse the tree. It is necessary to take into account different forms of tree notation: prefix, infix and postfix.

The question arises: how to represent the nodes of the tree so that it is most convenient to work with them? It is possible to represent a tree using an array, where each node is described by a combined type value, which has a character-type information field and two reference-type fields. But this is not very convenient, since the trees have a large number of nodes that are not predetermined. Therefore, it is best to use dynamic variables when describing a tree. Then each node is represented by a value of the same type, which contains a description of a given number of information fields, and the number of corresponding fields must be equal to the degree of the tree. It is logical to define the absence of descendants with nil. Then, in Pascal, the description of a binary tree might look like this:

TYPE TreeLink = ^Tree;

tree = record;

Inf : <datatype>;

Left, Right : TreeLink;

End.

3. Examples of the implementation of operations

1. Construct a tree of n nodes of minimum height, or a perfectly balanced tree (the number of nodes of the left and right subtrees of such a tree must differ by no more than one).

Recursive construction algorithm:

1) the first node is taken as the root of the tree.

2) the left subtree of nl nodes is built in the same way.

3) the right subtree of nr nodes is built in the same way;

nr = n - nl - 1. As an information field, we will take the node numbers entered from the keyboard. The recursive function that implements this construction will look like this:

Function Tree(n : Byte) : TreeLink;

Var t : TreeLink; nl,nr,x : Byte;

Begin

If n = 0 then Tree := nil

else

Begin

nl := n div 2;

nr = n - nl - 1;

writeln('Enter vertex number ');

readln (x);

new(t);

t^.inf := x;

t^.left := Tree(nl);

t^.right := Tree(nr);

Tree := t;

End;

{Tree}

End.

2. In the binary ordered tree, find the node with the given value of the key field. If there is no such element in the tree, then add it to the tree.

Search Procedure(x : Byte; var t : TreeLink);

Begin

If t = nil then

Begin

New(t);

t^inf := x;

t^.left := nil;

t ^ .right: = nil;

End

Else if x < t^.inf then

Search(x, t^.left)

Else if x > t^.inf then

Search(x, t^.right)

else

Begin

{process found element}

...

End;

End.

3. Write tree traversal procedures in forward, symmetric and reverse order, respectively.

3.1. Procedure Preorder(t : TreeLink);

Begin

If t <> nil then

Begin

WriteIn(t^.inf);

Preorder(t^.left);

Preorder(t^.right);

End;

End;

3.2. Procedure Inorder(t : TreeLink);

Begin

If t <> nil then

Begin

Inorder(t^.left);

WriteIn(t^.inf);

Inorder(t^.right);

End;

End.

3.3. Procedure Postorder(t : TreeLink);

Begin

If t <> nil then

Begin

postorder(t^.left);

postorder(t^.right);

WriteIn(t^.inf);

End;

End.

4. In the binary ordered tree, delete the node with the given value of the key field.

Let us describe a recursive procedure that will take into account the presence of the required element in the tree and the number of descendants of this node. If the node to be deleted has two children, then it will be replaced by the largest key value in its left subtree, and only then will it be permanently deleted.

Procedure Delete1(x : Byte; var t : TreeLink);

Var p : TreeLink;

Procedure Delete2(var q : TreeLink);

Begin

If q^.right <> nil then Delete2(q^.right)

else

Begin

p^.inf := q^.inf;

p := q;

q := q^.left;

End;

End;

Begin

If t = nil then

Writeln('no element found')

Else if x < t^.inf then

Delete1(x, t^.left)

Else if x > t^.inf then

Delete1(x, t^.right)

else

Begin

P := t;

If p^.left = nil then

t := p^.right

else

If p^.right = nil then

t := p^.left

else

Delete2(p^.left);

End;

End.

LECTURE No. 10. Counts

1. The concept of a graph. Ways to represent a graph

A graph is a pair G = (V,E), where V is a set of objects of an arbitrary nature, called vertices, and E is a family of pairs ei = (vil, vi2), vijOV, called edges. In the general case, the set V and/or the family E may contain an infinite number of elements, but we will consider only finite graphs, that is, graphs for which both V and E are finite. If the order of the elements included in ei matters, then the graph is called directed, abbreviated - digraph, otherwise - undirected. The edges of a digraph are called arcs. In what follows, we assume that the term "graph", used without specification (directed or undirected), denotes an undirected graph.

If e = , then the vertices v and u are called the ends of the edge. Here we say that the edge e is adjacent (incident) to each of the vertices v and u. Vertices v and and are also called adjacent (incident). In the general case, edges of the form e = ; such edges are called loops.

The degree of a vertex in a graph is the number of edges incident to that vertex, with loops being counted twice. Since each edge is incident to two vertices, the sum of the degrees of all vertices in the graph is equal to twice the number of edges: Sum(deg(vi), i=1...|V|) = 2 * |E|.

A node's weight is a number (real, integer, or rational) assigned to a given node (interpreted as cost, throughput, etc.). Weight, edge length - a number or several numbers that are interpreted as length, bandwidth, etc.

A path in a graph (or a route in a digraph) is an alternating sequence of vertices and edges (or arcs in a digraph) of the form v0, (v0,v1), v1..., (vn - 1,vn), vn. The number n is called the path length. A path without repeating edges is called a chain; a path without repeating vertices is called a simple chain. The path can be closed (v0 = vn). A closed path without repeating edges is called a cycle (or contour in a digraph); without repeating vertices (except for the first and last) - a simple loop.

A graph is called connected if there is a path between any two of its vertices, and disconnected otherwise. A disconnected graph consists of several connected components (connected subgraphs).

There are various ways to represent graphs. Let's consider each of them separately.

1. Incidence matrix.

This is a rectangular matrix of dimension n x n, where n is the number of vertices, am is the number of edges. The values ​​of the matrix elements are determined as follows: if the edge xi and the vertex vj are incident, then the value of the corresponding matrix element is equal to one, otherwise the value is zero. For directed graphs, the incidence matrix is ​​constructed according to the following principle: the value of the element is equal to - 1 if edge xi comes from vertex vj, equal to 1 if edge xi enters vertex vj, and equal to XNUMX otherwise.

2. Adjacency matrix.

This is a square matrix of dimensions n x n, where n is the number of vertices. If vertices vi and vj are adjacent, that is, if there is an edge connecting them, then the corresponding matrix element is equal to one, otherwise it is equal to zero. The rules for constructing this matrix for directed and undirected graphs are no different. The adjacency matrix is ​​more compact than the incidence matrix. It should be noted that this matrix is ​​also very sparse, but in the case of an undirected graph it is symmetrical with respect to the main diagonal, so you can store not the entire matrix, but only half of it (a triangular matrix).

3. List of adjacencies (incidents).

It is a data structure that stores a list of adjacent vertices for each graph vertex. The list is an array of pointers, the i-th element of which contains a pointer to the list of vertices adjacent to the i-th vertex.

An adjacency list is more efficient than an adjacency matrix because it eliminates the storage of null elements.

4. List of lists.

It is a tree-like data structure in which one branch contains lists of vertices adjacent to each of the graph vertices, and the second branch points to the next graph vertex. This way of representing the graph is the most optimal.

2. Representation of a graph by an incidence list. Graph Depth Traversal Algorithm

To implement a graph as an incidence list, you can use the following type:

TypeList = ^S;

S=record;

inf: Byte;

next : List;

end;

Then the graph is defined as follows:

Var Gr : array[1..n] of List;

Now let's turn to the graph traversal procedure. This is an auxiliary algorithm that allows you to view all the vertices of the graph, analyze all information fields. If we consider a graph traversal in depth, then there are two types of algorithms: recursive and non-recursive.

With the recursive depth-first traversal algorithm, we take an arbitrary vertex and find an arbitrary unseen (new) vertex v adjacent to it. Then we take the vertex v as not new and find any new vertex adjacent to it. If some vertex does not have newer unseen vertices, then we consider this vertex to be used and return one level higher to the vertex from which we got to our used vertex. The traversal continues in this way until there are no new unscanned vertices in the graph.

In Pascal, the depth-first traversal procedure would look like this:

Procedure Obhod(gr : Graph; k : Byte);

Var g : Graph; l : List;

Begin

nov[k] := false;

g := gr;

While g^.inf <> k do

g := g^.next;

l := g^.smeg;

While l <> nil do begin

If nov[l^.inf] then Obhod(gr, l^.inf);

l := l^.next;

End;

End;

Note

In this procedure, when describing the Graph type, we meant the description of a graph by a list of lists. Array nov[i] is a special array whose i-th element is True if the i-th vertex is not visited, and False otherwise.

A non-recursive traversal algorithm is also often used. In this case, recursion is replaced by a stack. Once a vertex has been viewed, it is pushed onto the stack, and it becomes used when there are no more new vertices adjacent to it.

3. Representation of a graph by a list of lists. Breadth Graph Traversal Algorithm

A graph can be defined using a list of lists as follows:

TypeList = ^Tlist;

tlist=record

inf: Byte;

next : List;

end;

Graph = ^TGpaph;

TGpaph = record

inf: Byte;

smeg:List;

next: Graph;

end;

When traversing the graph in breadth, we select an arbitrary vertex and look through all the vertices adjacent to it at once. A queue is used instead of a stack. The breadth-first search algorithm is very handy for finding the shortest path in a graph.

Here is a procedure for traversing a graph in width in pseudocode:

Procedure Obhod2(v);

{values ​​spisok, nov - global}

Begin

queue = O;

queue <= v;

nov[v] = False;

While queue <> O do

Begin

p <= queue;

For u in spisok(p) do

If new[u] then

Begin

nov[u] := False;

queue <= u;

End;

End;

End;

LECTURE #11. Object data type

1. Object type in Pascal. The concept of an object, its description and use

Historically, the first approach to programming has been procedural programming, otherwise known as bottom-up programming. Initially, common libraries of standard programs used in various fields of computer application were created. Then, based on these programs, more complex programs were created to solve specific problems.

However, computer technology was constantly developing, it began to be used to solve various problems of production, the economy, and therefore it became necessary to process data of various formats and solve non-standard problems (for example, non-numerical ones). Therefore, when developing programming languages, they began to pay attention to the creation of various types of data. This contributed to the emergence of such complex data types as combined, multiple, string, file, etc. Before solving the problem, the programmer carried out decomposition, i.e., splitting the task into several subtasks, for each of which a separate module was written. The main programming technology included three stages:

1) top-down design;

2) modular programming;

3) structural coding.

But starting from the mid-60s of the XX century, new concepts and approaches began to form, which formed the basis of the technology of object-oriented programming. In this approach, modeling and description of the real world is carried out at the level of concepts of a specific subject area to which the problem being solved belongs.

Object-oriented programming is a programming technique that closely resembles our behavior. It is a natural evolution of earlier innovations in programming language design. Object-oriented programming is more structural than all previous developments regarding structured programming. It is also more modular and more abstract than previous attempts at data abstraction and programming details internally. An object-oriented programming language is characterized by three main properties:

1) Encapsulation. Combining records with procedures and functions that manipulate the fields of these records forms a new data type - an object;

2) Inheritance. Definition of an object and its further use to build a hierarchy of child objects with the ability for each child object related to the hierarchy to access the code and data of all parent objects;

3) Polymorphism. Giving an action a single name, which is then shared up and down the hierarchy of objects, with each object in the hierarchy performing that action in a way that suits it.

Speaking of the object, we introduce a new data type - object. An object type is a structure consisting of a fixed number of components. Each component is either a field containing data of a strictly defined type, or a method that performs operations on an object. By analogy with the declaration of variables, the declaration of a field specifies the data type of this field and the identifier naming the field: by analogy with the declaration of a procedure or function, the declaration of a method specifies the title of a procedure, function, constructor, or destructor.

An object type can inherit components of another object type. If type T2 inherits from type T1, then type T2 is a child of type T1, and type T1 itself is a parent of type T2. Inheritance is transitive, i.e. if TK inherits from T2, and T2 inherits from T1, then TK inherits from T1. The scope (domain) of an object type consists of itself and all its descendants.

The following source code is an example of an object type declaration, type

type

point = object

X, Y: integer;

end;

Rect = object

A, B: TPoint;

procedure Init(XA, YA, XB, YB: Integer);

procedure Copy(var R: TRectangle);

procedure Move(DX, DY: Integer);

procedure Grow(DX, DY: Integer);

procedure Intersect(var R: TRectangle);

procedure Union(var R: TRectangle);

function Contains(P: Point): Boolean;

end;

StringPtr = ^String;

FieldPtr = ^TField;

TField = object

X, Y, Len: Integer;

Name: StringPtr;

constructor Copy(var F: TField);

constructor Init(FX, FY, FLen: Integer; FName: String);

destructor Done; virtual;

procedure Display; virtual;

procedure Edit; virtual;

function GetStr: String; virtual;

function PutStr(S: String): Boolean; virtual;

end;

StrFieldPtr = ^TStrField;

StrField = object(TField)

Value: PString;

constructor Init(FX, FY, FLen: Integer; FName: String);

destructor Done; virtual;

function GetStr: String; virtual;

function PutStr(S: String): Boolean;

virtual;

function Get:string;

procedure Put(S: String);

end;

NumFieldPtr = ^TNumField;

TNumField = object(TField)

private

Value, Min, Max: Longint;

public insurance

constructor Init(FX, FY, FLen: Integer; FName: String;

FMin, FMax: Longint);

function GetStr: String; virtual;

function PutStr(S: String): Boolean; virtual;

function Get: Longint;

function Put(N: Longint);

end;

ZipFieldPtr = ^TZipField;

ZipField = object(TNumField)

function GetStr: String; virtual;

function PutStr(S: String): Boolean;

virtual;

end.

Unlike other types, object types can only be declared in the type declaration section at the outermost level of the scope of a program or module. Thus, object types cannot be declared in a variable declaration section or inside a procedure, function, or method block.

A file type component type cannot have an object type or any structure type containing object type components.

2. Inheritance

The process by which one type inherits the characteristics of another type is called inheritance. The descendant is called the derived (child) type, and the type that the child type inherits from is called the parent (parent) type.

Previously known Pascal record types cannot inherit. However, Borland Pascal extends the Pascal language to support inheritance. One of these extensions is a new data structure category related to records, but much more powerful. The data types in this new category are defined using the new reserved word "object". An object type can be defined as a complete, independent type in the manner of describing Pascal entries, but it can also be defined as a descendant of an existing object type by putting the parent type in parentheses after the reserved word "object".

3. Instantiate Objects

An instance of an object is created by declaring a variable or constant of an object type, or by applying the standard New procedure to a variable of type "pointer to object type". The resulting object is called an instance of the object type;

var

F: TField;

Z: TZipField;

FP:PField;

ZP: PZipField;

Given these variable declarations, F is an instance of TField and Z is an instance of TZipField. Similarly, after applying New to FP and ZP, FP will point to a TField instance, and ZP will point to a TZipField instance.

If an object type contains virtual methods, then instances of that object type must be initialized by calling a constructor before calling any virtual method.

Below is an example:

var

S: StrField;

bend

S.Init(1, 1, 25, 'First name');

S.Put('Vladimir');

S.Display;

...

S Done;

end.

If S.Init was not called, then calling S.Display will cause this example to fail.

Assigning an instance of an object type does not imply initialization of the instance. An object is initialized by compiler-generated code that runs between the invocation of the constructor and the point at which execution actually reaches the first statement in the constructor's code block.

If the object instance is not initialized and range checking is enabled (by the {SR+} directive), then the first call to the virtual method of the object instance gives a run-time error. If range checking is disabled (by the {SR-} directive), then the first call to a virtual method of an uninitialized object can lead to unpredictable behavior.

The mandatory initialization rule also applies to instances that are components of struct types. For example:

var

Comment: array [1..5] of TStrField;

I: integer

begin

for I := 1 to 5 do

Comment [I].Init (1, I + 10, 40, 'first_name');

.

.

.

for I := 1 to 5 do Comment [I].Done;

end;

For dynamic instances, initialization is typically about placement and cleanup is about deletion, which is achieved through the extended syntax of the New and Dispose standard procedures. For example:

var

SP: StrFieldPtr;

begin

New(SP, Init(1, 1, 25, 'first_name');

SP^.Put('Vladimir');

SP^.Display;

.

.

.

Dispose(SP, Done);

end.

A pointer to an object type is assignment compatible with a pointer to any parent object type, so at runtime a pointer to an object type can point to an instance of that type or to an instance of any child type.

For example, a pointer of type ZipFieldPtr can be assigned to pointers of type PZipField, PNumField, and PField, and at runtime, a pointer of type PField can either be nil or point to an instance of TField, TNumField, or TZipField, or any instance of a child type of TField. .

These assignment pointer compatibility rules also apply to object type parameters. For example, the TField.Cop method can be passed instances of TField, TStrField, TNumField, TZipField, or any other type child of TField.

4. Components and Scope

The scope of a bean identifier extends beyond the object type. Moreover, the scope of a bean identifier extends through the blocks of procedures, functions, constructors, and destructors that implement the methods of the object type and its descendants. Based on these considerations, the spelling of the component identifier must be unique within the object type and within all its descendants, as well as within all its methods.

The scope of the component identifier described in the private part of the type declaration is limited to the module (program) that contains the object type declaration. In other words, private identifier beans act like ordinary public identifiers within the module that contains the object type declaration, and outside the module any private beans and identifiers are unknown and inaccessible. By putting related types of objects in the same module, you can make sure that these objects can access each other's private components, and these private components will be unknown to other modules.

In an object type declaration, a method header may specify the parameters of the object type being described, even if the description is not yet complete.

LECTURE No. 12. Methods

1. Methods

A method declaration inside an object type corresponds to a forward method declaration (forward). Thus, somewhere after an object type declaration, but within the same scope as the scope of the object type declaration, a method must be implemented by defining its declaration.

For procedural and functional methods, the defining declaration takes the form of a normal procedure or function declaration, with the exception that in this case the procedure or function identifier is treated as a method identifier.

For constructor and destructor methods, the defining declaration takes the form of a procedure method declaration, with the exception that the reserved word procedure is replaced by the reserved word constructor or destructor.

The defining method declaration may, but need not, repeat the list of formal parameters of the method header in the object type. In this case, the method header must exactly match the header in the object type in order, types, and parameter names, and in the return type of the function result if the method is a function.

The defining description of a method always contains an implicit parameter with the identifier Self, corresponding to a formal variable parameter that has an object type. Within a method block, Self represents the instance whose method component was specified to invoke the method. Thus, any changes to the values ​​of the Self fields are reflected in the instance.

The scope of an object type bean identifier extends to blocks of procedures, functions, constructors, and destructors that implement methods of that object type. The effect is the same as if a with statement of the following form were inserted at the beginning of the method block:

with self do

begin

...

end;

Based on these considerations, the spelling of component identifiers, formal method parameters, Self, and any identifier introduced into the executable part of the method must be unique.

If a unique method identifier is required, then the qualified method identifier is used. It consists of an object type identifier followed by a dot and a method identifier. As with any other identifier, a qualified method identifier may optionally be preceded by a package identifier and a period.

Virtual Methods

Methods are static by default, but except for constructors, they can be virtual (by including the virtual directive in the method declaration). The compiler resolves references to static method calls during the compilation process, while calls to virtual methods are resolved at run time. This is sometimes called late binding.

If an object type declares or inherits any virtual method, then the variables of that type must be initialized by calling a constructor before calling any virtual method. Thus, an object type that describes or inherits a virtual method must also describe or inherit at least one constructor method.

An object type can override any of the methods it inherits from its parents. If a method declaration in a child specifies the same method identifier as a method declaration in the parent, then the declaration in the child overrides the declaration in the parent. The scope of an overriding method expands to the scope of the child in which the method was introduced, and will remain so until the method identifier is overridden again.

Overriding a static method is independent of changing the method header. In contrast, a virtual method override must preserve the order, parameter types and names, and function result types, if any. Moreover, the redefinition must again include the virtual directive.

Dynamic Methods

Borland Pascal supports additional late-bound methods called dynamic methods. Dynamic methods differ from virtual methods only in the way they are dispatched at runtime. In all other respects, dynamic methods are considered equivalent to virtual methods.

A dynamic method declaration is equivalent to a virtual method declaration, but the dynamic method declaration must include the dynamic method index, which is specified immediately after the virtual keyword. The index of a dynamic method must be an integer constant between 1 and 656535 and must be unique among the indexes of other dynamic methods contained in the object type or its ancestors. For example:

procedure FileOpen(var Msg: TMessage); virtual 100;

An override of a dynamic method must match the order, types, and names of the parameters, and exactly match the result type of the function of the parent method. The override must also include a virtual directive followed by the same dynamic method index that was specified in the ancestor object type.

2. Constructors and destructors

Constructors and destructors are specialized forms of methods. Used in connection with the extended syntax of the New and Dispose standard procedures, constructors and destructors have the ability to place and remove dynamic objects. In addition, constructors have the ability to perform the required initialization of objects containing virtual methods. Like all methods, constructors and destructors can be inherited, and objects can contain any number of constructors and destructors.

Constructors are used to initialize newly created objects. Typically, initialization is based on the values ​​passed to the constructor as parameters. A constructor cannot be virtual because the dispatch mechanism of a virtual method depends on the constructor that initialized the object first.

Here are some examples of constructors:

constructor Field.Copy(var F: Field);

begin

Self := F;

end;

constructor Field.Init(FX, FY, FLen: integer; FName: string);

begin

X := FX;

Y := FY;

GetMem(Name, Length(FName) + 1);

Name^ := FName;

end;

constructor TStrField.Init(FX, FY, FLen: integer; FName: string);

begin

inherited Init(FX, FY, FLen, FName);

Field.Init(FX, FY, FLen, FName);

GetMem(Value, Len);

Value^ := '';

end;

The main action of a constructor of a derived (child) type, such as the above TStr Field. Init is almost always a call to the appropriate constructor of its immediate parent to initialize the inherited fields of the object. After executing this procedure, the constructor initializes the fields of the object that belong only to the derived type.

Destructors are the opposite of constructors and are used to clean up objects after they have been used. Normally, cleanup consists of removing all pointer fields in the object.

Note

A destructor can be virtual, and often is. A destructor rarely has parameters.

Here are some examples of destructors:

destructor Field Done;

begin

FreeMem(Name, Length(Name^) + 1);

end;

destructor StrField.Done;

begin

FreeMem(Value, Len);

Field Done;

end;

The destructor of a child type, such as the TStrField above. Done, typically first removes the pointer fields introduced in the derived type, and then, as a last step, calls the appropriate collector-destructor of the immediate parent to remove the object's inherited pointer fields.

3. Destructors

Borland Pascal provides a special type of method called a garbage collector (or destructor) for cleaning up and deleting a dynamically allocated object. The destructor combines the step of deleting an object with any other actions or tasks required for that type of object. You can define multiple destructors for a single object type.

The destructor is defined along with all other object methods in the object's type definition:

type

Temployee = object

Name: string[25];

Title: string[25];

Rate: Real;

constructor Init(AName, ATitle: String; ARate: Real);

destructor Done; virtual;

function GetName: String;

function GetTitle: String;

function GetRate: Rate; virtual;

function GetPayAmount: Real; virtual;

end;

Destructors can be inherited and they can be either static or virtual. Since different finalizers tend to require different types of objects, it is generally recommended that destructors are always virtual so that the correct destructor is executed for each type of object.

The reserved word destructor does not need to be specified for every cleanup method, even if the object's type definition contains virtual methods. Destructors really only work on dynamically allocated objects.

When a dynamically allocated object is cleaned up, the destructor performs a special function: it ensures that the correct number of bytes are always freed in the dynamically allocated memory area. There can be no concern about using a destructor with statically allocated objects; in fact, by not passing the object's type to the destructor, the programmer deprives an object of that type of the full benefits of dynamic memory management in Borland Pascal.

Destructors actually become themselves when polymorphic objects must be cleared and when the memory they occupy must be deallocated.

Polymorphic objects are those objects that have been assigned to a parent type due to Borland Pascal's extended type compatibility rules. An instance of an object of type THourly assigned to a variable of type TEmployee is an example of a polymorphic object. These rules can also be applied to objects; a pointer to THourly can freely be assigned to a pointer to TEmployee, and the object pointed to by that pointer will again be a polymorphic object. The term "polymorphic" is appropriate because code that processes an object "doesn't know" exactly at compile time what type of object it will eventually need to process. The only thing it knows is that this object belongs to a hierarchy of objects that are descendants of the specified object type.

Obviously, the sizes of the object types are different. So when it comes time to clean up a heap-allocated polymorphic object, how does Dispose know how many bytes of heap space to free? At compile time, no information about the object's size can be extracted from a polymorphic object.

The destructor solves this puzzle by referring to the place where this information is written - in the TCM implementation variables. Each TBM of an object type contains the size in bytes of that object type. The table of virtual methods of any object is available through the hidden parameter Self, sent to the method when the method is called. A destructor is just a type of method, and so when an object calls it, the destructor gets a copy of Self on the stack. Thus, if an object is polymorphic at compile time, it will never be polymorphic at run time due to late binding.

To perform this late-bound deallocation, the destructor must be called as part of the extended syntax of the Dispose procedure:

Dispose(P, Done);

(A call to the destructor outside of the Dispose procedure does not deallocate any memory at all.) What is really happening here is that the garbage collector of the object pointed to by P is executed as a normal method. However, once the last action is complete, the destructor looks up the size of its type's implementation in the TCM and passes the size on to the Dispose procedure. The Dispose procedure terminates the process by deleting the correct number of bytes of the heap space that (the space) previously belonged to P^. The number of bytes to be freed will be correct regardless of whether P pointed to an instance of type TSalaried, or whether it pointed to one of the child types of type TSalaried, such as TCommissioned.

Note that the destructor method itself can be empty and perform only this function:

destructorAnObject.Done;

begin

end;

What is useful in this destructor is not the property of its body, however, the compiler generates epilogue code in response to the destructor reserved word. It's like a module that doesn't export anything, but does some invisible work by executing its initialization section before starting the program. All action takes place behind the scenes.

4. Virtual Methods

A method becomes virtual if its object type declaration is followed by the new reserved word virtual. If a method in a parent type is declared as virtual, then all methods with the same name in child types must also be declared virtual to avoid a compiler error.

The following are the objects from the example payroll, properly virtualized:

type

PEmployee = ^TEmployee;

Temployee = object

Name, Title: string[25];

Rate: Real;

constructor Init(AName, ATitle: String; ARate: Real);

function GetPayAmount: Real; virtual;

function GetName : String;

function GetTitle : String;

function GetRate : Real;

procedure Show; virtual;

end;

PHourly = ^THourly;

THourly = object(TEmployee);

Time: Integer;

constructor Init(AName, ATitle: String; ARate: Real; Time: Integer);

function GetPayAmount: Real; virtual;

function GetTime: Integer;

end;

PSalaried = ^TSalaried;

TSalaried = object(TEmployee);

function GetPayAmount: Real; virtual;

end;

PCommissioned = ^TCommissioned;

TCommissioned = object(Salaried);

Commission: Real;

Sales Amount : Real;

constructor Init(AName, ATitle: String; ARate,

ACommission, ASalesAmount: Real);

function GetPayAmount: Real; virtual;

end;

A constructor is a special type of procedure that does some setup work for the virtual method mechanism. Moreover, the constructor must be called before any virtual method is called. Calling a virtual method without first calling the constructor can block the system, and the compiler has no way to check the order in which the methods are called.

Every object type that has virtual methods must have a constructor.

A warning

The constructor must be called before any other virtual method is called. Calling a virtual method without a previous call to the constructor can cause a system lock and the compiler cannot check the order in which the methods are called.

Note

For object constructors, it is suggested to use the identifier Init.

Each distinct object instance must be initialized with a separate constructor call. It is not enough to initialize one instance of an object and then assign that instance to others. Other instances, even if they may contain valid data, will not be initialized with an assignment operator and will block the system on any calls to their virtual methods. For example:

var

FBee, GBee: Bee; { create two Bee instances }

begin

FBee.Init(5, 9) { constructor call for FBee }

GBee := FBee; { Gbee is invalid! }

end;

What exactly does a constructor create? Each object type contains something called a virtual method table (VMT) in the data segment. The TVM contains the size of the object type and, for each virtual method, a pointer to the code that executes that method. A constructor establishes a relationship between the object's calling implementation and the object's type TCM.

It is important to remember that there is only one TBM for each object type. Separate instances of an object type (i.e., variables of this type) contain only the connection to the TBM, but not the TBM itself. The constructor sets the value of this connection to TBM. It is because of this that nowhere can you start execution before calling the constructor.

5. Object data fields and formal method parameters

The implication of the fact that methods and their objects share a common scope is that a method's formal parameters cannot be identical to any of the object's data fields. This is not some new limitation imposed by object-oriented programming, but rather the same old scope rules that Pascal has always had. This is the same as preventing a procedure's formal parameters from being identical to the procedure's local variables:

procedure CrunchIt(Crunchee: MyDataRec, Crunchby,

ErrorCode: integer);

var

A, B: char;

ErrorCode: integer;

begin

.

.

.

A procedure's local variables and its formal parameters share a common scope and therefore cannot be identical. You will get "Error 4: Duplicate identifier" if you try to compile something like this, the same error occurs when you try to set a formal method parameter to the name of the field of the object to which the method belongs.

The circumstances are somewhat different, as putting the procedure header inside a data structure is a nod to an innovation in Turbo Pascal, but the basic principles of Pascal scope have not changed.

LECTURE No. 13. Compatibility of object types

1. Encapsulation

The combination of code and data in an object is called encapsulation. In principle, it is possible to provide enough methods so that the user of an object will never access the object's fields directly. Some other object-oriented languages, such as Smalltalk, require mandatory encapsulation, but Borland Pascal has a choice.

For example, the TEmployee and THourly objects are written in such a way that there is absolutely no need to directly access their internal data fields:

type

Temployee = object

Name, Title: string[25];

Rate: Real;

procedure Init(AName, ATitle: string; ARate: Real);

function GetName : String;

function GetTitle : String;

function GetRate : Real;

function GetPayAmount: Real;

end;

THourly = object(TEmployee)

Time: Integer;

procedure Init(AName, ATitle: string; ARate:

Real, Atime: Integer);

function GetPayAmount: Real;

end;

There are only four data fields here: Name, Title, Rate and Time. The GetName and GetTitle methods display the worker's last name and position, respectively. The GetPayAmount method uses Rate, and in the case of a working THourly and Time to calculate the amount of payments to the working. There is no longer a need to refer directly to these data fields.

Assuming the existence of an AnHourly instance of type THourly, we could use a set of methods to manipulate AnHourly data fields like this:

with An hourly do

begin

Init (Aleksandr Petrov, Fork lift operator' 12.95, 62);

{Displays the last name, position and amount of payments}

show;

end;

It should be noted that access to the fields of an object is carried out only with the help of methods of this object.

2. Expanding Objects

Unfortunately, standard Pascal does not provide any facilities for creating flexible procedures that allow you to work with completely different data types. Object-oriented programming solves this problem with inheritance: if a derived type is defined, then the methods of the parent type are inherited, but they can be overridden if desired. To override an inherited method, simply declare a new method with the same name as the inherited method, but with a different body and (if necessary) a different set of parameters.

Let's define a child type of TEmployee that represents an employee who is paid an hourly rate in the following example:

const

PayPeriods = 26; { payment periods }

Overtime Threshold = 80; { for the payment period }

OvertimeFactor = 1.5; { hourly rate }

type

THourly = object(TEmployee)

Time: Integer;

procedure Init(AName, ATitle: string; ARate:

Real, Atime: Integer);

function GetPayAmount: Real;

end;

procedure THourly.Init(AName, ATitle: string;

ARate: Real, Atime: Integer);

begin

TEmployee.Init(AName, ATitle, ARate);

Time := ATime;

end;

function THourly.GetPayAmount: Real;

var

Overtime: Integer;

begin

Overtime := Time - OvertimeThreshold;

if Overtime > 0 then

GetPayAmount := RoundPay(OvertimeThreshold * Rate +

Rate OverTime * OvertimeFactor * Rate)

else

GetPayAmount := RoundPay(Time * Rate)

end;

A person who is paid an hourly rate is a worker: he has everything that is used to define the TEmployee object (name, position, rate), and only the amount of money received by the hourly person depends on how many hours he worked during the period payable. Thus, THourly also requires a Time field.

Because THourly defines a new Time field, its initialization requires a new Init method that initializes both the time and the inherited fields. Instead of directly assigning values ​​to inherited fields such as Name, Title, and Rate, why not reuse the TEmployee object's initialization method (illustrated by the first THourly Init statement).

Calling a method that is being overridden is not the best style. In general, it is possible that TEmployee.Init performs an important but hidden initialization.

When calling an overridden method, you must be sure that the derived object type includes the parent's functionality. In addition, any change in the parent method automatically affects all child methods.

After calling TEmployee.Init, THourly.Init can then perform its own initialization, which in this case only consists of assigning the value passed in ATime.

Another example of an overridden method is the THourly.GetPayAmount function, which calculates the payout amount for an hourly employee. In fact, each type of TEmployee object has its own GetPayAmount method, since the type of worker depends on how the calculation is made. The THourly.GetPayAmount method should take into account how many hours the employee worked, whether there was overtime, what the increase factor for overtime was, etc.

TSalaried method. GetPayAmount should only divide the employee's rate by the number of payments in each year (in our example).

unit workers;

interface

const

PayPeriods = 26; {in year}

Overtime Threshold = 80; {for each payment period}

OvertimeFactor=1.5; {increase against normal payment}

type

Temployee = object

Name, Title: string[25];

Rate: Real;

procedure Init(AName, ATitle: string; ARate: Real);

function GetName : String;

function GetTitle : String;

function GetRate : Real;

function GetPayAmount: Real;

end;

THourly = object(TEmployee)

Time: Integer;

procedure Init(AName, ATitle: string; ARate:

Real, Atime: Integer);

function GetPayAmount: Real;

function GetTime: Real;

end;

TSalaried = object(TEmployee)

function GetPayAmount: Real;

end;

TCommissioned = object(TSalaried)

Commission: Real;

Sales Amount : Real;

constructor Init(AName, ATitle: String; ARate,

ACommission, ASalesAmount: Real);

function GetPayAmount: Real;

end;

implementation

function RoundPay(Wages: Real) : Real;

{round up payouts to ignore amounts less than

monetary unit}

begin

RoundPay := Trunc(Wages * 100) / 100;

.

.

.

TEmployee is the top of our object hierarchy and contains the first GetPayAmount method.

function TEmployee.GetPayAmount : Real;

begin

RunError(211); { give runtime error }

end;

It may come as a surprise that the method gives a run-time error. If Employee.GetPayAmount is called, an error occurs in the program. Why? Because TEmployee is the top of our object hierarchy and doesn't define a real worker; therefore, none of the TEmployee methods are called in a particular way, although they may be inherited. All of our employees are either hourly, salaried or piecework. A run-time error terminates program execution and outputs 211, which corresponds to an error message associated with an abstract method call (if the program calls TEmployee.GetPayAmount by mistake).

Below is the THourly.GetPayAmount method, which takes into account things like overtime pay, hours worked, etc.

function THourly.GetPayAMount : Real;

var

OverTime: Integer;

begin

Overtime := Time - OvertimeThreshold;

if Overtime > 0 then

GetPayAmount := RoundPay(OvertimeThreshold * Rate +

Rate OverTime * OvertimeFactor * Rate)

else

GetPayAmount := RoundPay(Time * Rate)

end;

The TSalaried.GetPayAmount method is much simpler; bet in it

divided by the number of payments:

function TSalaried.GetPayAmount : Real;

begin

GetPayAmount := RoundPay(Rate / PayPeriods);

end;

If you look at the TCommissioned.GetPayAmount method, you'll see that it calls TSalaried.GetPayAmount, calculates the commission, and adds it to the value returned by the TSalaried method. GetPayAmount.

function TCommissioned.GetPayAmount : Real;

begin

GetPayAmount := RoundPay(TSalaried.GetPayAmount +

Commission * Sales Amount);

end;

Important note: While methods can be overridden, data fields cannot be overridden. Once a data field has been defined in an object hierarchy, no child type can define a data field with exactly the same name.

3. Compatibility of object types

Inheritance modifies Borland Pascal's type compatibility rules to some extent. Among other things, a derived type inherits the type compatibility of all its parent types.

This extended type compatibility takes three forms:

1) between implementations of objects;

2) between pointers to object implementations;

3) between formal and actual parameters.

However, it is very important to remember that in all three forms, type compatibility only extends from child to parent. In other words, child types can be freely used in place of parent types, but not vice versa.

For example, TSalaried is a child of TEmployee and TSosh-missioned is a child of TSalaried. With that in mind, consider the following descriptions:

type

PEmployee = ^TEmployee;

PSalaried = ^TSalaried;

PCommissioned = ^TCommissioned;

var

AnEmployee: TEmployee;

ASalaried: TSalaried;

PCommissioned: TCommissioned;

TEmployeePtr: PEmployee;

TSalariedPtr: PSalaried;

TCommissionedPtr: PCommissioned;

Under these descriptions, the following operators are valid

assignments:

AnEmployee :=ASalaried;

ASalaried := ACommissioned;

TCommissionedPtr := ACommissioned;

Note

A parent object can be assigned an instance of any of its derived types. Back assignments are not allowed.

This concept is new to Pascal, and at first it might be hard to remember what order type compatibility comes in. You need to think like this: the source must be able to completely fill the receiver. Derived types contain everything that their parent types contain due to the property of inheritance. Therefore, the derived type is either exactly the same size, or (which is most often the case) it is larger than its parent, but never smaller. Assigning a parent (parent) object to a child (child) could leave some fields of the child object undefined, which is dangerous and therefore illegal.

In assignment statements, only fields that are common to both types will be copied from the source to the destination. In the assignment operator:

AnEmployee:= ACommissioned;

Only the Name, Title, and Rate fields from ACommissioned will be copied to AnEmployee, since these are the only fields that are common to TCommissioned and TEmployee. Type compatibility also works between pointers to object types, and follows the same general rules as for object implementations. A pointer to a child can be assigned to a pointer to the parent. Given the previous definitions, the following pointer assignments are valid:

TSalariedPtr:= TCommissionedPtr;

TEmployeePtr:= TSalariedPtr;

TEmployeePtr:= PCommissionedPtr;

Remember that reverse assignments are not allowed!

A formal parameter (either a value or a variable parameter) of a given object type can take as its actual parameter an object of its own type or objects of all child types. If you define a procedure header like this:

procedure CalcFedTax(Victim: TSalaried);

then the actual parameter types can be TSalaried or TCommissioned, but not TEmployee. Victim can also be a variable parameter. In this case, the same compatibility rules are followed.

Comment

There is a fundamental difference between value parameters and variable parameters. A value parameter is a pointer to the actual object passed as a parameter, while a variable parameter is just a copy of the actual parameter. Moreover, this copy includes only those fields that are included in the type of the formal value parameter. This means that the actual parameter is literally converted to the type of the formal parameter. A variable parameter is more like casting to a pattern, in the sense that the actual parameter remains unchanged.

Similarly, if the formal parameter is a pointer to an object type, the actual parameter can be a pointer to that object type or to any child type. Let the title of the procedure be given:

procedure Worker.Add(AWorker: PSalared);

Valid actual parameter types would then be PSalaried or PCommissioned, but not PEmployee.

LECTURE No. 14. Assembler

1. About assembler

Once upon a time, assembler was a language without knowing which it was impossible to make a computer do anything useful. Gradually the situation changed. More convenient means of communication with a computer appeared. But unlike other languages, assembler did not die; moreover, it could not do this in principle. Why? In search of an answer, we will try to understand what assembly language is in general.

In short, assembly language is a symbolic representation of machine language. All processes in the machine at the lowest, hardware level are driven only by commands (instructions) of the machine language. From this it is clear that, despite the common name, the assembly language for each type of computer is different. This also applies to the appearance of programs written in assembler, and the ideas that this language is a reflection of.

Really solving hardware-related problems (or, even more, hardware-related ones, such as speeding up a program, for example) is impossible without knowledge of assembler.

A programmer or any other user can use any high-level tools up to programs for building virtual worlds and, perhaps, not even suspect that the computer is actually executing not the commands of the language in which its program is written, but their transformed representation in the form of a boring and dull sequence commands of a completely different language - machine language. Now imagine that such a user has a non-standard problem. For example, his program must work with some unusual device or perform other actions that require knowledge of the principles of computer hardware. No matter how good the language in which the programmer wrote his program is, he cannot do without knowing the assembler. And it is no coincidence that almost all compilers of high-level languages ​​contain means of connecting their modules with modules in assembler or support access to the assembler programming level.

A computer is made up of several physical devices, each of which is connected to one unit, called the system unit. To understand their functional purpose, let's look at the block diagram of a typical computer (Fig. 1). It does not pretend to absolute accuracy and aims only to show the purpose, relationship and typical composition of the elements of a modern personal computer.

Rice. 1. Structural diagram of a personal computer

2. Software model of the microprocessor

In today's computer market, there is a wide variety of different types of computers. Therefore, it is possible to assume that the consumer will have a question about how to evaluate the capabilities of a particular type (or model) of a computer and its distinctive features from computers of other types (models). To bring together all the concepts that characterize a computer in terms of its functional program-controlled properties, there is a special term - computer architecture. For the first time, the concept of computer architecture began to be mentioned with the advent of third-generation machines for their comparative evaluation.

It makes sense to start learning the assembly language of any computer only after finding out what part of the computer is left visible and available for programming in this language. This is the so-called computer program model, part of which is the microprocessor program model, which contains thirty-two registers, more or less available for use by the programmer.

These registers can be divided into two large groups:

1) 6 user registers;

2) 16 system registers.

3. User registers

As the name implies, user registers are called because the programmer can use them when writing his programs. These registers include (Fig. 2):

1) eight 32-bit registers that can be used by programmers to store data and addresses (they are also called general purpose registers (RON)):

eax/ax/ah/al;

ebx/bx/bh/bl;

edx/dx/dh/dl;

ecx/cx/ch/cl;

ebp/bp;

esi/si;

edi/di;

esp/sp.

2) six segment registers: cs, ds, ss, es, fs, gs;

3) status and control registers:

flag register eflags/flags;

eip/ip command pointer register.

Rice. 2. User registers

Many of these registers are given with a slash. These are not different registers - they are parts of one large 32-bit register. They can be used in the program as separate objects.

4. General registers

All registers of this group allow you to access their "lower" parts. Only the lower 16- and 8-bit parts of these registers can be used for self-addressing. The upper 16 bits of these registers are not available as independent objects.

Let's list the registers belonging to the group of general purpose registers. Since these registers are physically located in the microprocessor inside the arithmetic logic unit (AL>), they are also called ALU registers:

1) eax/ax/ah/al (Accumulator register) - battery. Used to store intermediate data. In some commands, the use of this register is required;

2) ebx/bx/bh/bl (Base register) - base register. Used to store the base address of some object in memory;

3) ecx/cx/ch/cl (Count register) - counter register. It is used in commands that perform some repetitive actions. Its use is often implicit and hidden in the algorithm of the corresponding command.

For example, the command for organizing the loop loop, in addition to transferring control to a command located at a certain address, analyzes and decrements the value of the esx/cx register by one;

4) edx/dx/dh/dl (Data register) - data register.

Just like the eax/ax/ah/al register, it stores intermediate data. Some commands require its use; for some commands this happens implicitly.

The following two registers are used to support the so-called chain operations, i.e. operations that sequentially process chains of elements, each of which can be 32, 16 or 8 bits long:

1) esi/si (Source Index register) - source index.

This register in chain operations contains the current address of the element in the source chain;

2) edi/di (Destination Index register) - index of the receiver (recipient). This register in chain operations contains the current address in the destination chain.

In the architecture of the microprocessor at the hardware and software level, such a data structure as a stack is supported. To work with the stack in the microprocessor instruction system there are special commands, and in the microprocessor software model there are special registers for this:

1) esp/sp (Stack Pointer register) - stack pointer register. Contains a pointer to the top of the stack in the current stack segment.

2) ebp/bp (Base Pointer register) - stack frame base pointer register. Designed to organize random access to data inside the stack.

The use of hard pinning of registers for some instructions makes it possible to encode their machine representation more compactly. Knowing these features will, if necessary, save at least a few bytes of memory occupied by the program code.

5. Segment registers

There are six segment registers in the microprocessor software model: cs, ss, ds, es, gs, fs.

Their existence is due to the specifics of the organization and use of RAM by Intel microprocessors. It lies in the fact that the microprocessor hardware supports the structural organization of the program in the form of three parts, called segments. Accordingly, such an organization of memory is called segmented.

In order to indicate the segments to which the program has access at a particular point in time, segment registers are intended. In fact (with a slight correction) these registers contain the memory addresses from which the corresponding segments begin. The logic of processing a machine instruction is constructed in such a way that addresses in well-defined segment registers are implicitly used when fetching an instruction, accessing program data or accessing the stack.

The microprocessor supports the following types of segments.

1. Code segment. Contains program commands. To access this segment, the cs register (code segment register) is used - the segment code register. It contains the address of the machine instruction segment that the microprocessor has access to (i.e., these instructions are loaded into the microprocessor pipeline).

2. Data segment. Contains the data processed by the program. To access this segment, the ds (data segment register) register is used - a segment data register that stores the address of the data segment of the current program.

3. Stack segment. This segment is a region of memory called the stack. The microprocessor organizes work with the stack according to the following principle: the last element written to this area is selected first. To access this segment, the ss register (stack segment register) is used - the stack segment register containing the address of the stack segment.

4. Additional data segment. Implicitly, the algorithms for executing most machine instructions assume that the data they process is located in the data segment, the address of which is in the ds segment register. If one data segment is not enough for the program, then it has the opportunity to use three more additional data segments. But unlike the main data segment, whose address is contained in the ds segment register, when using additional data segments, their addresses must be specified explicitly using special segment redefinition prefixes in the command. Addresses of additional data segments must be contained in the registers es, gs, fs (extension data segment registers).

6. Status and control registers

The microprocessor includes several registers that constantly contain information about the state of both the microprocessor itself and the program whose instructions are currently loaded onto the pipeline. These registers include:

1) flag register eflags/flags;

2) eip/ip command pointer register.

Using these registers, you can get information about the results of command execution and influence the state of the microprocessor itself. Let us consider in more detail the purpose and contents of these registers.

1. eflags/flags (flag register) - flag register. The bit depth of eflags/flags is 32/16 bits. Individual bits of this register have a specific functional purpose and are called flags. The lower part of this register is exactly the same as the flags register for 18086. Figure 3 shows the contents of the eflags register.

Rice. 3. The contents of the eflags register

Depending on how they are used, the flags of the eflags/flags register can be divided into three groups:

1) eight status flags.

These flags may change after machine instructions have been executed. The status flags of the eflags register reflect the specifics of the result of the execution of arithmetic or logical operations. This makes it possible to analyze the state of the computational process and respond to it using conditional jump commands and subroutine calls. Table 1 lists the status flags and their purpose.

2) one control flag.

Denoted df (Directory Flag). It is located in bit 10 of the eflags register and is used by chained commands. The value of the df flag determines the direction of element-by-element processing in these operations: from the beginning of the string to the end (df = 0) or vice versa, from the end of the string to its beginning (df = 1). There are special commands for working with the df flag: eld (remove the df flag) and std (set the df flag). The use of these commands allows you to adjust the df flag in accordance with the algorithm and ensure that counters are automatically incremented or decremented when performing operations on strings.

3) five system flags.

Controls I/O, maskable interrupts, debugging, task switching, and 8086 virtual mode. It is not recommended for application programs to modify these flags unnecessarily, as this will cause the program to terminate in most cases. Table 2 lists the system flags and their purpose.

Table 1. Status flagsTable 2. System flags

2. eip/ip (Instraction Pointer register) - instruction pointer register. The eip/ip register is 32/16 bits wide and contains the offset of the next instruction to be executed relative to the contents of the cs segment register in the current instruction segment. This register is not directly accessible to the programmer, but its value is loaded and changed by various control commands, which include commands for conditional and unconditional jumps, calling procedures, and returning from procedures. The occurrence of interrupts also modifies the eip/ip register.

LECTURE No. 15. Registers

1. Microprocessor system registers

The very name of these registers suggests that they perform specific functions in the system. The use of system registers is strictly regulated. It is they who provide the protected mode. They can also be thought of as part of the microprocessor architecture, which is deliberately left visible so that a qualified system programmer can perform the most low-level operations.

System registers can be divided into three groups:

1) four control registers;

2) four registers of system addresses;

3) eight debug registers.

2. Control registers

The group of control registers includes four registers: cr0, cr1, cr2, cr3. These registers are for general system control. Control registers are only available to programs with privilege level 0.

Although the microprocessor has four control registers, only three of them are available - cr1 is excluded, the functions of which are not yet defined (it is reserved for future use).

The cr0 register contains system flags that control the modes of operation of the microprocessor and reflect its state globally, regardless of the specific tasks being performed.

Purpose of system flags:

1) pe (Protect Enable), bit 0 - enable the protected mode of operation. The state of this flag shows in which of the two modes - real (pe = 0) or protected (pe = 1) - the microprocessor is operating at a given time;

2) mp (Math Present), bit 1 - the presence of a coprocessor. Always 1;

3) ts (Task Switched), bit 3 - task switching. The processor automatically sets this bit when it switches to another task;

4) am (Alignment Mask), bit 18 - alignment mask. This bit enables (am = 1) or disables (am = 0) alignment control;

5) cd (Cache Disable), bit 30 - disable cache memory.

Using this bit, you can disable (cd =1) or enable (cd = 0) the use of the internal cache (the first level cache);

6) pg (PaGing), bit 31 - enable (pg =1) or disable (pg = 0) paging.

The flag is used in the paging model of memory organization.

The cr2 register is used in RAM paging to register the situation when the current instruction accessed the address contained in a memory page that is currently not in memory.

In such a situation, an exception number 14 occurs in the microprocessor, and the linear 32-bit address of the instruction that caused this exception is written to register cr2. With this information, the exception handler 14 determines the desired page, swaps it into memory and resumes the normal operation of the program;

The cr3 register is also used for paging memory. This is the so-called first-level page directory register. It contains the 20-bit physical base address of the current task's page directory. This directory contains 1024 32-bit descriptors, each of which contains the address of the second level page table. In turn, each of the second level page tables contains 1024 32-bit descriptors that address page frames in memory. The page frame size is 4 KB.

3. Registers of system addresses

These registers are also called memory management registers.

They are designed to protect programs and data in the multitasking mode of the microprocessor. When operating in microprocessor protected mode, the address space is divided into:

1) global - common to all tasks;

2) local - separate for each task.

This separation explains the presence of the following system registers in the microprocessor architecture:

1) the register of the global descriptor table gdtr (Global Descriptor Table Register), having a size of 48 bits and containing a 32-bit (bits 16-47) base address of the global descriptor table GDT and a 16-bit (bits 0-15) limit value, which is size in bytes of the GDT table;

2) the local descriptor table register ldtr (Local Descriptor Table Register), having a size of 16 bits and containing the so-called selector of the descriptor of the local descriptor table LDT This selector is a pointer in the GDT table, which describes the segment containing the local descriptor table LDT;

3) the register of the interrupt descriptor table idtr (Interrupt Descriptor Table Register), having a size of 48 bits and containing a 32-bit (bits 16-47) base address of the IDT interrupt descriptor table and a 16-bit (bits 0-15) limit value, which is size in bytes of the IDT table;

4) 16-bit task register tr (Task Register), which, like the ldtr register, contains a selector, i.e. a pointer to a descriptor in the GDT table. This descriptor describes the current Task Segment Status (TSS). This segment is created for each task in the system, has a strictly regulated structure and contains the context (current state) of the task. The main purpose of TSS segments is to save the current state of a task at the moment of switching to another task.

4. ​​Debug registers

This is a very interesting group of registers intended for hardware debugging. Hardware debugging tools first appeared in the i486 microprocessor. In hardware, the microprocessor contains eight debug registers, but only six of them are actually used.

Registers dr0, dr1, dr2, dr3 have a width of 32 bits and are designed to set the linear addresses of four breakpoints. The mechanism used in this case is the following: any address generated by the current program is compared with the addresses in registers dr0... dr3, and if there is a match, a debugging exception with number 1 is generated.

Register dr6 is called the debug status register. The bits in this register are set according to the reasons that caused the last exception number 1 to occur.

We list these bits and their purpose:

1) b0 - if this bit is set to 1, then the last exception (interrupt) occurred as a result of reaching the checkpoint defined in register dr0;

2) b1 - similar to b0, but for a checkpoint in register dr1;

3) b2 - similar to b0, but for a checkpoint in register dr2;

4) bЗ - similar to b0, but for a checkpoint in register dr3;

5) bd (bit 13) - serves to protect the debug registers;

6) bs (bit 14) - set to 1 if exception 1 was caused by the state of the flag tf = 1 in the eflags register;

7) bt (bit 15) is set to 1 if exception 1 was caused by a switch to a task with the trap bit set in TSS t = 1.

All other bits in this register are filled with zeros. Exception handler 1, based on the contents of dr6, must determine the reason for the exception and take the necessary actions.

Register dr7 is called the debug control register. It contains fields for each of the four debug breakpoint registers that allow you to specify the following conditions under which an interrupt should be generated:

1) checkpoint registration location - only in the current task or in any task. These bits occupy the lower 8 bits of register dr7 (2 bits for each breakpoint (actually a breakpoint) set by registers dr0, dr1, dr2, dr3, respectively).

The first bit of each pair is the so-called local resolution; setting it tells the breakpoint to take effect if it is within the current task's address space.

The second bit in each pair specifies the global permission, which indicates that the given breakpoint is valid within the address spaces of all tasks in the system;

2) the type of access by which the interrupt is initiated: only when fetching a command, when writing, or when writing / reading data. The bits that determine this nature of the occurrence of an interrupt are located in the upper part of this register. Most of the system registers are programmatically accessible.

LECTURE No. 16. Assembler Programs

1. The structure of the program in assembler

An assembly language program is a collection of blocks of memory called memory segments. A program may consist of one or more of these block-segments. Each segment contains a collection of language sentences, each of which occupies a separate line of program code.

Assembly statements are of four types:

1) commands or instructions, which are symbolic analogues of machine commands. During the translation process, assembly instructions are converted into the corresponding commands of the microprocessor instruction set;

2) macros. These are sentences of the text of the program that are formalized in a certain way and are replaced by other sentences during the broadcast;

3) directives, which are an indication to the assembler translator to perform certain actions. Directives have no counterparts in machine representation;

4) comment lines containing any characters, including letters of the Russian alphabet. Comments are ignored by the translator.

2. Assembly Syntax

The sentences that make up a program can be a syntactic construct corresponding to a command, macro, directive, or comment. In order for the assembler translator to recognize them, they must be formed according to certain syntactic rules. To do this, it is best to use a formal description of the syntax of the language, like the rules of grammar. The most common ways to describe a programming language in this way are syntax diagrams and extended Backus-Naur forms. For practical use, syntax diagrams are more convenient. For example, the syntax of assembly language statements can be described using the syntax diagrams shown in the following figures.

Rice. 4. Assembler sentence format

Rice. 5. Directive Format

Rice. 6. Format of commands and macros

On these drawings:

1) label name - an identifier, the value of which is the address of the first byte of the sentence of the source code of the program that it denotes;

2) name - an identifier that distinguishes this directive from other directives of the same name. As a result of processing by the assembler of a certain directive, certain characteristics can be assigned to this name;

3) an operation code (COP) and a directive are mnemonic designations of the corresponding machine instruction, macroinstruction or translator directive;

4) operands - parts of the command, macro or assembler directives, denoting objects on which operations are performed. Assembler operands are described by expressions with numeric and text constants, variable labels and identifiers using operator signs and some reserved words.

How to use syntax diagrams? It's very simple: all you need to do is find and then follow the path from the diagram's input (left) to its output (right). If such a path exists, then the sentence or construction is syntactically correct. If there is no such path, then the compiler will not accept this construction. When working with syntax diagrams, pay attention to the direction of the traversal indicated by the arrows, as among the paths there may be those that can be followed from right to left. In fact, syntactic diagrams reflect the logic of the translator when parsing the input sentences of the program.

Allowed characters when writing the text of programs are:

1) all Latin letters: A - Z, a - z. In this case, uppercase and lowercase letters are considered equivalent;

2) numbers from 0 to 9;

3) signs ?, @, S, _, &;

4) separators.

Assembler sentences are formed from lexemes, which are syntactically inseparable sequences of valid language symbols that make sense for the translator.

The tokens are as follows.

1. Identifiers are sequences of valid characters used to designate program objects such as operation codes, variable names, and label names. The rule for writing identifiers is as follows: an identifier can consist of one or more characters. As characters, you can use letters of the Latin alphabet, numbers and some special characters - _, ?, $, @. An identifier cannot start with a digit character. The length of the identifier can be up to 255 characters, although the translator accepts only the first 32 and ignores the rest. You can adjust the length of possible identifiers using the mv command line option. In addition, it is possible to tell the translator to distinguish between uppercase and lowercase letters or ignore their difference (which is done by default). The /mu, /ml, /mx command line options are used for this.

2. Chains of characters - sequences of characters enclosed in single or double quotes.

3. Integers in one of the following number systems: binary, decimal, hexadecimal. Identification of numbers when writing them in assembler programs is carried out according to certain rules:

1) decimal numbers do not require any additional characters to be identified, for example, 25 or 139;

2) to identify binary numbers in the source text of the program, it is necessary to put the Latin "b" after writing the zeros and ones that make up them, for example, 10010101 b;

3) Hexadecimal numbers have more conventions when writing:

a) firstly, they consist of the numbers 0...9, lowercase and uppercase letters of the Latin alphabet a, b, c, d, e, Gili D B, C, D, E, E

b) secondly, the translator may have difficulty recognizing hexadecimal numbers due to the fact that they can consist of only digits 0...9 (for example, 190845) or begin with a letter of the Latin alphabet (for example, efl5 ). In order to “explain” to the translator that a given token is not a decimal number or an identifier, the programmer must highlight the hexadecimal number in a special way. To do this, write the Latin letter “h” at the end of the sequence of hexadecimal digits that make up a hexadecimal number. This is a must. If a hexadecimal number begins with a letter, then a leading zero is written before it: 0 efl5 h.

Thus, we figured out how the sentences of an assembler program are constructed. But this is only the most superficial view.

Almost every sentence contains a description of the object on which or with the help of which some action is performed. These objects are called operands. They can be defined as follows: operands are objects (some values, registers or memory cells) that are affected by instructions or directives, or they are objects that define or refine the action of instructions or directives.

Operands can be combined with arithmetic, logical, bitwise, and attribute operators to calculate some value or determine a memory location that will be affected by a given command or directive.

Let us consider in more detail the characteristics of the operands in the following classification:

1) constant or immediate operands - a number, string, name or expression that has some fixed value. The name must not be relocatable, that is, it must not depend on the address of the program to be loaded into memory. For example, it can be defined with the equal or = operators;

2) address operands, set the physical location of the operand in memory by specifying two components of the address: segment and offset (Fig. 7);

Rice. 7. Syntax of description of address operands

3) relocatable operands - any symbolic names representing some memory addresses. These addresses may indicate the memory location of some instructions (if the operand is a label) or data (if the operand is the name of a memory location in the data segment).

Relocatable operands differ from address operands in that they are not tied to a specific physical memory address. The segment component of the address of the operand being moved is unknown and will be determined after the program is loaded into memory for execution.

The address counter is a specific kind of operand. It is denoted by the sign S. The specificity of this operand is that when the assembler translator encounters this symbol in the source program, it substitutes the current value of the address counter instead. The value of the address counter, or placement counter as it is sometimes called, is the offset of the current machine instruction from the beginning of the code segment. In the listing format, the second or third column corresponds to the address counter (depending on whether the column with the nesting level is present or not in the listing). If we take any listing as an example, then it is clear that when the translator processes the next assembler instruction, the address counter increases by the length of the generated machine instruction. It is important to understand this point correctly. For example, processing assembler directives does not change the counter. Directives, unlike assembler commands, are only instructions to the compiler to perform certain actions to form the machine representation of the program, and for them the compiler does not generate any constructs in memory.

When using such an expression to jump, be aware of the length of the instruction itself in which this expression is used, since the value of the address counter corresponds to the offset in the instruction segment of this instruction, and not of the instruction following it. In our example, the jmp command takes 2 bytes. But be careful, the length of an instruction depends on what operands it uses. An instruction with register operands will be shorter than an instruction with one of its operands located in memory. In most cases, this information can be obtained by knowing the format of the machine instruction and by analyzing the listing column with the instruction's object code;

4) register operand is just a register name. In an assembler program, you can use the names of all general purpose registers and most system registers;

5) base and index operands. This operand type is used to implement indirect base, indirect index addressing, or combinations and extensions thereof;

6) Structural operands are used to access a specific element of a complex data type called a structure.

Records (similar to a struct type) are used to access a bit field of some record.

Operands are elementary components that form part of a machine instruction, denoting the objects on which the operation is performed. In a more general case, operands can be included as components in more complex formations called expressions. Expressions are combinations of operands and operators considered as a whole. The result of expression evaluation can be the address of some memory cell or some constant (absolute) value.

We have already considered the possible types of operands. We now list the possible types of assembler operators and the syntactic rules for the formation of assembler expressions, and give a brief description of the operators.

1. Arithmetic operators. These include:

1) unary "+" and "-";

2) binary "+" and "-";

3) multiplication "*";

4) integer division "/";

5) getting the remainder from the division "mod".

These operators are located at precedence levels 6,7,8 in Table 4.

Rice. 8. Syntax of arithmetic operations

2. Shift operators shift the expression by the specified number of bits (Fig. 9).

Rice. 9. Syntax of shift operators

3. Comparison operators (return the value "true" or "false") are intended for the formation of logical expressions (Fig. 10 and Table 3). The logical value "true" corresponds to a digital unit, and "false" - to zero.

Rice. 10. Syntax of comparison operators

Table 3. Comparison operators

4. Logical operators perform bitwise operations on expressions (Fig. 11). Expressions must be absolute, i.e., such, the numerical value of which can be calculated by the translator.

Rice. 11. Syntax of logical operators

5. Index operator []. Parentheses are also an operator, and the translator perceives their presence as an instruction to add the value of expression_1 behind these brackets with expression_2 enclosed in brackets (Fig. 12).

Rice. 12. Index operator syntax

Note that the following designation is adopted in the literature on assembler: when the text refers to the contents of a register, its name is taken in parentheses. We will also adhere to this notation.

6. The ptr type redefinition operator is used to redefine or qualify the type of a label or variable defined by an expression (Fig. 13).

The type can take one of the following values: byte, word, dword, qword, tbyte, near, far.

Rice. 13. Syntax of type redefinition operator

7. The segment redefinition operator ":" (colon) forces the calculation of a physical address relative to a specifically specified segment component: "segment register name", "segment name" from the corresponding SEGMENT directive, or "group name" (Fig. 14). When discussing segmentation, we talked about the fact that the microprocessor at the hardware level supports three types of segments - code, stack and data. What is this hardware support? For example, to select the execution of the next command, the microprocessor must necessarily look at the contents of the segment register cs and only it. And this register, as we know, contains the (not yet shifted) physical address of the beginning of the instruction segment. To get the address of a particular instruction, the microprocessor needs to multiply the contents of cs by 16 (which means a shift by four bits) and add the resulting 20-bit value to the 16-bit contents of the ip register. Approximately the same thing happens when the microprocessor processes the operands in the machine instruction. If it sees that the operand is an address (an effective address that is only part of the physical address), then it knows which segment to look for it in - by default, it is the segment whose start address is stored in segment register ds.

But what about the stack segment? In the context of our consideration, we are interested in the sp and bp registers. If the microprocessor sees one of these registers as an operand (or part of it, if the operand is an expression), then by default it forms the physical address of the operand, using the contents of the ss register as its segment component. This is a set of microprograms in the microprogram control unit, each of which executes one of the instructions in the microprocessor machine instruction system. Each microprogram works according to its own algorithm. Of course, you cannot change it, but you can slightly correct it. This is done using the optional machine command prefix field. If we agree on how the command works, then this field is missing. If we want to make an amendment (if, of course, it is permissible for a particular command) to the algorithm of the command, then it is necessary to form the appropriate prefix.

A prefix is ​​a one-byte value whose numerical value determines its purpose. The microprocessor recognizes by the specified value that this byte is a prefix, and further work of the microprogram is performed taking into account the received instruction to correct its work. Now we are interested in one of them - the segment replacement (redefinition) prefix. Its purpose is to indicate to the microprocessor (and in fact, the firmware) that we do not want to use the default segment. The possibilities for such a redefinition are, of course, limited. The command segment cannot be redefined, the address of the next executable command is uniquely determined by the cs: ip pair. And here segments of a stack and data - it is possible. That's what the ":" operator is for. The assembler translator, processing this statement, generates the corresponding one-byte segment replacement prefix.

Rice. 14. Syntax of the segment redefinition operator

8. The structure type naming operator "."(dot) also forces the compiler to perform certain calculations if it occurs in an expression.

9. The operator for obtaining the segment component of the address of the expression seg returns the physical address of the segment for the expression (Fig. 15), which can be a label, variable, segment name, group name, or some symbolic name.

Rice. 15. Syntax of the segment component receiving operator

10. The operator for obtaining the offset of the expression offset allows you to get the value of the offset of the expression (Fig. 16) in bytes relative to the beginning of the segment in which the expression is defined.

Rice. 16. Syntax of the offset get operator

As in high-level languages, the execution of assembler operators when evaluating expressions is carried out in accordance with their priorities (Table 4). Operations with the same priority are executed sequentially from left to right. Changing the order of execution is possible by placing parentheses that have the highest precedence.

Table 4. Operators and their precedence

3. Segmentation Directives

In the course of the previous discussion, we found out all the basic rules for writing instructions and operands in an assembly language program. The question of how to properly format the sequence of commands so that the translator can process them and the microprocessor can execute them remains open.

When considering the architecture of the microprocessor, we learned that it has six segment registers, through which it can work simultaneously:

1) with one code segment;

2) with one stack segment;

3) with one data segment;

4) with three additional data segments.

Recall again that a segment is physically a memory area occupied by commands and (or) data whose addresses are calculated relative to the value in the corresponding segment register.

The syntactic description of a segment in assembler is the construction shown in Figure 17:

Rice. 17. Segment description syntax

It is important to note that the functionality of a segment is somewhat broader than simply breaking the program into blocks of code, data, and stack. Segmentation is part of a more general mechanism related to the concept of modular programming. It involves the unification of the design of object modules created by the compiler, including those from different programming languages. This allows you to combine programs written in different languages. It is for the implementation of various options for such a union that the operands in the SEGMENT directive are intended.

Consider them in more detail.

1. The segment alignment attribute (alignment type) tells the linker to ensure that the beginning of the segment is placed on the specified boundary. This is important because proper alignment makes data access faster on i80x86 processors. Valid values ​​for this attribute are as follows:

1) BYTE - alignment is not performed. A segment can start at any memory address;

2) WORD - the segment starts at an address that is a multiple of two, i.e. the last (least significant) bit of the physical address is 0 (aligned to the word boundary);

3) DWORD - the segment starts at an address that is a multiple of four, i.e. the last two (least significant) bits are 0 (double word boundary alignment);

4) PARA - the segment starts at an address that is a multiple of 16, i.e. the last hexadecimal digit of the address must be Oh (alignment to the paragraph boundary);

5) PAGE - the segment starts at an address that is a multiple of 256, i.e. the last two hexadecimal digits must be 00h (aligned to the boundary of a 256-byte page);

6) MEMPAGE - the segment starts at an address that is a multiple of 4 KB, i.e. the last three hexadecimal digits must be OOOh (the address of the next 4 KB memory page). The default alignment type is PARA.

2. The combine segment attribute (combinatorial type) tells the linker how to combine segments of different modules that have the same name. The segment combination attribute values ​​can be:

1) PRIVATE - the segment will not be merged with other segments with the same name outside of this module;

2) PUBLIC - causes the linker to connect all segments with the same name. The new merged segment will be whole and continuous. All addresses (offsets) of objects, and this may depend on the type of command and data segment, will be calculated relative to the beginning of this new segment;

3) COMMON - places all segments with the same name at the same address. All segments with the given name will overlap and share memory. The size of the resulting segment will be equal to the size of the largest segment;

4) AT xxxx - locates the segment at the absolute address of the paragraph (paragraph is the amount of memory, a multiple of 16; therefore, the last hexadecimal digit of the paragraph address is 0). The absolute address of a paragraph is given by xxx. The linker places the segment at a given memory address (this can be used, for example, to access video memory or a ROM> area), taking into account the combining attribute. Physically, this means that the segment, when loaded into memory, will be located starting from this absolute address of the paragraph, but to access it, the value specified in the attribute must be loaded into the corresponding segment register. All labels and addresses in a segment thus defined are relative to the given absolute address;

5) STACK - definition of a stack segment. Causes the linker to connect all segments with the same name and calculate the addresses in these segments relative to the ss register. The combined type STACK (stack) is similar to the combined type PUBLIC, except that the ss register is the standard segment register for stack segments. The sp register is set to the end of the concatenated stack segment. If no stack segment is specified, the linker will issue a warning that no stack segment was found. If a stack segment is created and the combined STACK type is not used, the programmer must explicitly load the segment address into the ss register (similar to the ds register).

The combination attribute defaults to PRIVATE.

3. A segment class attribute (class type) is a quoted string that helps the linker determine the appropriate segment order when assembling a program from multiple module segments. The linker combines all segments with the same class name together in memory (the class name can generally be anything, but it is better if it reflects the functionality of the segment). A typical use of a class name is to group together all the code segments of a program (usually the "code" class is used for this). Using the class typing mechanism, you can also group initialized and uninitialized data segments.

4. Segment size attribute. For i80386 and higher processors, segments can be 16-bit or 32-bit. This primarily affects the size of the segment and the order in which the physical address is formed within it. The attribute can take the following values:

1) USE16 - this means that the segment allows 16-bit addressing. When forming a physical address, only a 16-bit offset can be used. Accordingly, such a segment can contain up to 64 KB of code or data;

2)USE32 - the segment will be 32-bit. When forming a physical address, a 32-bit offset can be used. Therefore, such a segment can contain up to 4 GB of code or data.

All segments are equal in themselves, since the SEGMENT and ENDS directives do not contain information about the functional purpose of the segments. In order to use them as code, data, or stack segments, it is necessary to inform the translator about this in advance, for which a special ASSUME directive is used, which has the format shown in Fig. 18. This directive tells the translator which segment is bound to which segment register. In turn, this will allow the translator to correctly bind the symbolic names defined in the segments. Binding of segments to segment registers is carried out using the operands of this directive, in which the segment_name must be the name of the segment, defined in the source text of the program by the SEGMENT directive or the nothing keyword. If only the keyword nothing is used as an operand, then the previous segment register assignments are canceled, and for all six segment registers at once. But the keyword nothing can be used instead of the segment name argument; in this case, the connection between the segment with the name segment name and the corresponding segment register will be selectively broken (see Fig. 18).

Rice. 18. ASSUME Directive

For simple programs containing one segment for code, data, and stack, we would like to simplify its description. To do this, the translators MASM and TASM introduced the ability to use simplified segmentation directives. But here a problem arose related to the fact that it was necessary to somehow compensate for the inability to directly control the placement and combination of segments. To do this, together with simplified segmentation directives, they began to use the directive for specifying the MODEL memory model, which partially began to control the placement of segments and perform the functions of the ASSUME directive (therefore, when using simplified segmentation directives, the ASSUME directive can be omitted). This directive binds segments, which in the case of using simplified segmentation directives, have predefined names, with segment registers (although you still have to explicitly initialize ds).

The syntax of the MODEL directive is shown in Figure 19.

Rice. 19. Syntax of the MODEL directive

The mandatory parameter of the MODEL directive is the memory model. This parameter defines the memory segmentation model for the POU. It is assumed that a program module can have only certain types of segments, which are defined by the simplified segment description directives we mentioned earlier. These directives are shown in Table 5.

Table 5. Simplified segment definition directives

The presence of the [name] parameter in some directives indicates that it is possible to define several segments of this type. On the other hand, the existence of several kinds of data segments is due to the requirement to ensure compatibility with some compilers of high-level languages, which create different data segments for initialized and uninitialized data, as well as constants.

When using the MODEL directive, the translator makes available several identifiers that can be accessed during program operation in order to obtain information about certain characteristics of a given memory model (Table 7). Let's list these identifiers and their values ​​(Table 6).

Table 6. Identifiers created by the MODEL directive

We can now complete our discussion of the MODEL directive. The operands of the MODEL directive are used to specify a memory model that defines the set of program segments, the sizes of data and code segments, and the method of linking segments and segment registers. Table 7 shows some values ​​of the "memory model" parameter of the MODEL directive.

Table 7. Memory Models

The "modifier" parameter of the MODEL directive makes it possible to clarify some features of the use of the selected memory model (Table 8).

Table 8. Memory model modifiers

The optional parameters "language" and "language modifier" define some features of procedure calls. The need to use these parameters arises when writing and linking programs in various programming languages.

The standard and simplified segmentation directives we have described are not mutually exclusive. Standard directives are used when the programmer wishes to have complete control over the placement of segments in memory and their combination with segments from other modules.

Simplified directives are useful for simple programs and programs intended for linking with program modules written in high-level languages. This allows the linker to efficiently link modules from different languages ​​by standardizing linking and management.

LECTURE No. 17. Command structures in Assembler

1. Machine instruction structure

A machine command is an instruction to the microprocessor encoded according to certain rules to perform some operation or action. Each command contains elements that define:

1) what to do? (The answer to this question is given by the command element called the operation code (COP).);

2) objects on which something needs to be done (these elements are called operands);

3) how to do? (These elements are called operand types and are usually implicit.)

The machine instruction format shown in Figure 20 is the most general. The maximum length of a machine instruction is 15 bytes. A real command may contain a much smaller number of fields, up to one - only COP.

Rice. 20. Machine instruction format

Let us describe the purpose of the machine instruction fields.

1. Prefixes.

Optional machine instruction elements, each of which is 1 byte or may be omitted. In memory, prefixes precede the command. The purpose of prefixes is to modify the operation performed by the command. An application can use the following types of prefixes:

1) segment replacement prefix. Explicitly specifies which segment register is used in this instruction to address the stack or data. The prefix overrides the default segment register selection. Segment replacement prefixes have the following meanings:

a) 2eh - replacement of segment cs;

b) 36h - replacement of segment ss;

c) 3eh - replacement of segment ds;

d) 26h - replacement of segment es;

e) 64h - replacement of segment fs;

e) 65h - replacement of segment gs;

2) the address bitness prefix specifies the bitness of the address (32- or 16-bit). Each instruction that uses an address operand is assigned the bit width of that operand's address. This address can be 16 or 32 bits. If the address length for this command is 16 bits, this means that the command contains a 16-bit offset (Fig. 20), it corresponds to a 16-bit offset of the address operand relative to the beginning of some segment. In the context of Figure 21, this offset is called the effective address. If the address is 32 bits, this means that the command contains a 32-bit offset (Fig. 20), it corresponds to a 32-bit offset of the address operand relative to the beginning of the segment, and its value forms a 32-bit offset in the segment. With the address bitness prefix, you can change the default address bitness. This change will only affect the command preceded by the prefix;

Rice. 21. The mechanism of formation of a physical address in real mode

3) Operand bit width prefix is ​​similar to address bit width prefix, but indicates the operand bit length (32-bit or 16-bit) with which the instruction operates. What are the rules for setting address and operand bit width attributes by default?

In real mode and virtual 18086 mode, the values ​​of these attributes are 16 bits. In protected mode, the attribute values ​​depend on the state of the D bit in the executable segment descriptors. If D = 0, then default attribute values ​​are 16 bits; if D = 1, then 32 bits.

Prefix values ​​for operand width 66h and address width 67h. With the real mode address bit prefix, you can use 32-bit addressing, but be aware of the 64 KB segment size limit. Similar to the address-width prefix, you can use the real-mode operand-width prefix to work with 32-bit operands (for example, in arithmetic instructions);

4) the repetition prefix is ​​used with chain commands (line processing commands). This prefix "loops" the command to process all elements of the chain. The command system supports two types of prefixes:

a) unconditional (rep - OOh), forcing the chained command to be repeated a certain number of times;

b) conditional (repe/repz - OOh, repne/repnz - 0f2h), which, when looping, check some flags, and as a result of the check, early exit from the loop is possible.

2. Operation code.

Required element that describes the operation performed by the command. Many commands correspond to several operation codes, each of which determines the nuances of the operation. The subsequent fields of the machine instruction determine the location of the operands involved in the operation and the specifics of their use. Consideration of these fields is connected with the ways of specifying operands in a machine instruction and therefore will be performed later.

3. Addressing mode byte modr/m.

The value of this byte determines the operand address form used. Operands can be in memory in one or two registers. If the operand is in memory, then the modr/m byte specifies the components (offset, base and index registers) used to calculate its effective address (Figure 21). In protected mode, the sib byte (Scale-Index-Base) can additionally be used to determine the location of the operand in memory. The modr/m byte consists of three fields (Fig. 20):

1) the mod field determines the number of bytes occupied by the operand address in the command (Fig. 20, the offset field in the command). The mod field is used in conjunction with the r/m field, which specifies how to modify the address of the "instruction offset" operand. For example, if mod = 00, this means that there is no offset field in the command, and the address of the operand is determined by the contents of the base and (or) index register. Which registers will be used to calculate the effective address is determined by the value of this byte. If mod = 01, this means that the offset field is present in the command, occupies 1 byte and is modified by the contents of the base and (or) index register. If mod = 10, this means that the offset field is present in the command, occupies 2 or 4 bytes (depending on the default or prefix-defined address size), and is modified by the contents of the base and/or index register. If mod = 11, this means that there are no operands in memory: they are in registers. The same value of the mod byte is used when an immediate operand is used in the instruction;

2) the reg/cop field determines either the register located in the command in place of the first operand, or a possible extension of the opcode;

3) the r/m field is used in conjunction with the mod field and determines either the register located in the command at the place of the first operand (if mod = 11), or the base and index registers used to calculate the effective address (together with the offset field in the command).

4. Byte scale - index - base (byte sib).

Used to expand the possibilities of addressing operands. The presence of the sib byte in a machine instruction is indicated by a combination of one of the values ​​01 or 10 of the mod field and the value of the r/m = 100 field. The sib byte consists of three fields:

1) scale fields ss. This field contains the scaling factor for the index component index, which occupies the next 3 bits of the sib byte. The ss field can contain one of the following values: 1, 2, 4, 8.

When calculating the effective address, the contents of the index register will be multiplied by this value;

2) index fields. Used to store the index register number that is used to calculate the effective address of the operand;

3) base fields. Used to store the base register number, which is also used to calculate the effective address of the operand. Almost all general purpose registers can be used as base and index registers.

5. Offset field in command.

An 8-, 16-, or 32-bit signed integer representing, in whole or in part (subject to the above considerations), the value of the effective address of the operand.

6. The field of the immediate operand.

An optional field that is an 8-bit, 16-bit, or 32-bit immediate operand. The presence of this field is, of course, reflected in the value of the modr/m byte.

2. Methods for specifying instruction operands

The operand is set implicitly at the firmware level

In this case, the instruction explicitly contains no operands. The command execution algorithm uses some default objects (registers, flags in eflags, etc.).

For example, the cli and sti commands implicitly work with the if interrupt flag in the eflags register, and the xlat command implicitly accesses the al register and a line in memory at the address specified by the ds:bx register pair.

The operand is specified in the instruction itself (immediate operand)

The operand is in the instruction code, that is, it is part of it. To store such an operand in a command, a field up to 32 bits long is allocated (Figure 20). The immediate operand can only be the second (source) operand. The destination operand can be either in memory or in a register.

For example: mov ax,0ffffti moves the hexadecimal constant ffff into register ax. The add sum, 2 command adds the contents of the field at the address sum with the integer 2 and writes the result at the place of the first operand, i.e. into memory.

The operand is in one of the registers

Register operands are specified by register names. Registers can be used:

1) 32-bit registers EAX, EBX, ECX, EDX, ESI, EDI, ESP, EUR;

2) 16-bit registers AX, BX, CX, DX, SI, DI, SP, BP;

3) 8-bit registers AH, AL, BH, BL, CH, CL, DH, DL;

4) segment registers CS, DS, SS, ES, FS, GS.

For example, the add ax,bx instruction adds the contents of registers ax and bx and writes the result to bx. The dec si command decrements the contents of si by 1.

The operand is in memory

This is the most complex and at the same time the most flexible way to specify operands. It allows you to implement the following two main types of addressing: direct and indirect.

In turn, indirect addressing has the following varieties:

1) indirect base addressing; its other name is register indirect addressing;

2) indirect base addressing with offset;

3) indirect index addressing with offset;

4) indirect base index addressing;

5) indirect base index addressing with offset.

The operand is an I/O port

In addition to the RAM address space, the microprocessor maintains an I/O address space, which is used to access I/O devices. The I/O address space is 64 KB. Addresses are allocated for any computer device in this space. A particular address value within this space is called an I/O port. Physically, the I / O port corresponds to a hardware register (not to be confused with a microprocessor register), which is accessed using special assembler instructions in and out.

For example:

in al,60h; enter a byte from port 60h

Registers addressed by an I/O port can be 8,16, 32, or XNUMX bits wide, but the register bit width is fixed for a particular port. The in and out commands operate on a fixed range of objects. The so-called accumulator registers EAX, AX, AL are used as a source of information or a recipient. The choice of register is determined by the bitness of the port. The port number can be specified as an immediate operand in the in and out instructions, or as a value in the DX register. The last method allows you to dynamically determine the port number in the program.

The operand is on the stack

Instructions may have no operands at all, may have one or two operands. Most instructions require two operands, one of which is the source operand and the other is the destination operand. It is important that one operand can be located in a register or memory, and the second operand must be in a register or directly in the instruction. An immediate operand can only be a source operand. In a two-operand machine instruction, the following combinations of operands are possible:

1) register - register;

2) register - memory;

3) memory - register;

4) immediate operand - register;

5) immediate operand - memory.

There are exceptions to this rule regarding:

1) chain commands that can move data from memory to memory;

2) stack commands that can transfer data from memory to a stack that is also in memory;

3) commands of the multiplication type, which, in addition to the operand specified in the command, also use a second, implicit operand.

Of the listed combinations of operands, register - memory and memory - register are most often used. In view of their importance, we will consider them in more detail. We will accompany the discussion with examples of assembler instructions that will show how the format of an assembler instruction changes when one or another type of addressing is applied. In this regard, look again at Figure 21, which shows the principle of forming a physical address on the address bus of the microprocessor. It can be seen that the address of the operand is formed as the sum of two components - the contents of the segment register shifted by 4 bits and the 16-bit effective address, which is generally calculated as the sum of three components: base, offset and index.

3. Addressing methods

We list and then consider the features of the main types of addressing operands in memory:

1) direct addressing;

2) indirect basic (register) addressing;

3) indirect basic (register) addressing with offset;

4) indirect index addressing with offset;

5) indirect base index addressing;

6) indirect base index addressing with offset.

Direct Addressing

This is the simplest form of addressing an operand in memory, since the effective address is contained in the instruction itself and no additional sources or registers are used to form it. The effective address is taken directly from the machine instruction offset field (see Figure 20), which can be 8, 16, 32 bits in size. This value uniquely identifies the byte, word, or double word located in the data segment.

Direct addressing can be of two types.

Relative direct addressing

Used for conditional jump instructions to indicate the relative jump address. The relativity of such a transition lies in the fact that the offset field of the machine instruction contains an 8-, 16- or 32-bit value, which, as a result of the operation of the instruction, will be added to the contents of the ip/eip instruction pointer register. As a result of this addition, the address is obtained, to which the transition is carried out.

Absolute direct addressing

In this case, the effective address is part of the machine instruction, but this address is formed only from the value of the offset field in the instruction. To form the physical address of the operand in memory, the microprocessor adds this field with the value of the segment register shifted by 4 bits. Several forms of this addressing can be used in an assembler instruction.

But such addressing is rarely used - commonly used cells in the program are assigned symbolic names. During translation, the assembler calculates and substitutes the offset values ​​of these names into the machine instruction it generates in the "instruction offset" field. As a result, it turns out that the machine instruction directly addresses its operand, having, in fact, in one of its fields the value of the effective address.

Other types of addressing are indirect. The word "indirect" in the name of these types of addressing means that only a part of the effective address can be in the instruction itself, and its remaining components are in registers, which are indicated by their contents by the modr/m byte and, possibly, by the sib byte.

Indirect basic (register) addressing

With this addressing, the effective address of the operand can be in any of the general purpose registers, except for sp / esp and bp / ebp (these are specific registers for working with a stack segment). Syntactically, in a command, this addressing mode is expressed by enclosing the register name in square brackets []. For example, the instruction mov ax, [ecx] places in registers ax the contents of the word at the address from the data segment with the offset stored in register esx. Since the contents of the register can be easily changed during the course of the program, this addressing method allows you to dynamically assign the address of an operand for some machine instruction. This property is very useful, for example, for organizing cyclic calculations and for working with various data structures such as tables or arrays.

Indirect base (register) addressing with offset

This type of addressing is an addition to the previous one and is designed to access data with a known offset relative to some base address. This type of addressing is convenient to use to access the elements of data structures, when the offset of the elements is known in advance, at the stage of program development, and the base (starting) address of the structure must be calculated dynamically, at the stage of program execution. Modification of the contents of the base register allows you to access the elements of the same name in different instances of the same type of data structures.

For example, the instruction mov ax,[edx+3h] transfers the words from the memory area to the registers ax at the address: the contents of edx + 3h.

The mov ax,mas[dx] instruction moves a word into register ax at the address: the contents of dx plus the value of the identifier mas (remember that the compiler assigns to each identifier a value equal to the offset of this identifier from the beginning of the data segment).

Indirect index addressing with offset

This kind of addressing is very similar to indirect base addressing with an offset. Here, too, one of the general purpose registers is used to form the effective address. But index addressing has one interesting feature that is very convenient for working with arrays. It is connected with the possibility of the so-called scaling of the contents of the index register. What it is?

Look at Figure 20. We are interested in the sib byte. When discussing the structure of this byte, we noted that it consists of three fields. One of these fields is the ss scale field, by which the contents of the index register are multiplied.

For example, in the mov ax,mas[si*2] instruction, the value of the effective address of the second operand is calculated by the expression mas+(si)*2. Due to the fact that the assembler does not have the means to organize array indexing, the programmer has to organize it on his own.

The ability to scale significantly helps in solving this problem, but provided that the size of the array elements is 1, 2, 4 or 8 bytes.

Indirect base index addressing

With this type of addressing, the effective address is formed as the sum of the contents of two general-purpose registers: base and index. These registers can be any general-purpose registers, and scaling of the contents of an index register is often used.

Indirect base index addressing with offset

This kind of addressing is the complement of indirect indexed addressing. The effective address is formed as the sum of three components: the contents of the base register, the contents of the index register, and the value of the offset field in the command.

For example, the mov eax,[esi+5] [edx] instruction moves a double word to the eax register at the address: (esi) + 5 + (edx).

The add ax,array[esi] [ebx] command adds the contents of register ax to the contents of the word at the address: the value of the identifier array + (esi) + (ebx).

LECTURE No. 18. Teams

1. Data transfer commands

For the convenience of practical application and reflection of their specifics, it is more convenient to consider the commands of this group in accordance with their functional purpose, according to which they can be divided into the following groups of commands:

1) general purpose data transfers;

2) input-output to the port;

3) work with addresses and pointers;

4) data transformations;

5) work with the stack.

General Data Transfer Commands

This group includes the following commands:

1) mov is the basic data transfer command. It implements a wide variety of shipping options. Note the specifics of this command:

a) the mov command cannot transfer from one memory area to another. If such a need arises, then any currently available general-purpose register should be used as an intermediate buffer;

b) it is impossible to load a value directly from memory into a segment register. Therefore, to perform such a load, you need to use an intermediate object. This may be a general purpose register or a stack;

c) you cannot transfer the contents of one segment register to another segment register. This is because there is no corresponding opcode in the command system. But the need for such action often arises. You can perform such a transfer using the same general-purpose registers as intermediate ones;

d) you cannot use the segment register CS as a destination operand. The reason is simple. The fact is that in the architecture of the microprocessor, the cs: ip pair always contains the address of the command that should be executed next. Changing the contents of the CS register with the mov command would actually mean a jump operation, not a transfer, which is unacceptable. 2) xchg - used for bidirectional data transfer. For this operation, of course, you can use a sequence of several mov instructions, but due to the fact that the exchange operation is used quite often, the developers of the microprocessor instruction system considered it necessary to introduce a separate xchg exchange instruction. Naturally, the operands must be of the same type. It is not allowed (as for all assembler instructions) to exchange the contents of two memory cells with each other.

Port I/O Commands

Look at Figure 22. It shows a highly simplified, conceptual diagram of computer hardware control.

Rice. 22. Conceptual diagram of computer hardware control

As you can see from Figure 22, the lowest level is the BIOS level, where the hardware is handled directly through the ports. This implements the concept of equipment independence. When replacing hardware, it will only be necessary to correct the corresponding BIOS functions, reorienting them to new addresses and the logic of the ports.

Fundamentally, managing devices directly through ports is easy. Information about port numbers, their bit depth, control information format is given in the technical description of the device. You only need to know the ultimate goal of your actions, the algorithm in accordance with which a particular device works, and the order of programming its ports, i.e., in fact, you need to know what and in what sequence you need to send to the port (when writing to it) or read from it (when reading) and how this information should be interpreted. To do this, just two commands that are present in the microprocessor command system are enough:

1) in accumulator, port_number - input to the accumulator from the port with number port_number;

2) out port, accumulator - output the contents of the accumulator to the port with the number port_number.

Commands for working with addresses and memory pointers

When writing programs in assembler, intensive work is done with the addresses of operands that are in memory. To support this kind of operations, there is a special group of commands, which includes the following commands:

1) lea destination, source - effective address loading;

2) Ids destination, source - loading the pointer into the data segment register ds;

3) les destination, source - loading the pointer into the register of the additional data segment es;

4) lgs destination, source - loading the pointer into the register of the additional data segment gs;

5) lfs destination, source - loading the pointer into the register of the additional data segment fs;

6) lss destination, source - load pointer into stack segment register ss.

The lea command is similar to the mov command in that it also performs a move. However, the lea instruction does not transfer data, but the effective address of the data (that is, the offset of the data from the beginning of the data segment) to the register indicated by the destination operand.

Often, to perform some action in a program, it is not enough to know the value of the effective data address alone, but it is necessary to have a full pointer to the data. A complete data pointer consists of a segment component and an offset. All other commands of this group allow you to get such a full pointer to an operand in memory in a pair of registers. In this case, the name of the segment register, in which the segment component of the address is placed, is determined by the operation code. Accordingly, the offset is placed in the general register indicated by the destination operand.

But not everything is so simple with the source operand. In fact, in a command, as a source, you cannot directly specify the name of the operand in memory, to which we would like to receive a pointer. First, you need to get the value of the full pointer in some memory area and specify the full address of the name of this area in the get command. To perform this action, you need to remember the directives for reserving and initializing memory.

When applying these directives, a special case is possible when the name of another data definition directive (in fact, the name of a variable) is specified in the operand field. In this case, the address of this variable is formed in memory. Which address will be generated (effective or complete) depends on the applied directive. If it is dw, then only the 16-bit value of the effective address is formed in memory; if it is dd, the full address is written to memory. The location of this address in memory is as follows: the low word contains the offset, the high word contains the 16-bit segment component of the address.

For example, when organizing work with a chain of characters, it is convenient to place its starting address in a certain register and then modify this value in a loop for sequential access to the elements of the chain.

The need to use commands to obtain a full data pointer in memory, i.e., the address of the segment and the offset value within the segment, arises, in particular, when working with chains.

Data conversion commands

Many microprocessor instructions can be attributed to this group, but most of them have certain features that require them to be attributed to other functional groups. Therefore, out of the entire set of microprocessor commands, only one command can be directly attributed to data conversion commands: xlat [address_of_transcoding_table]

This is a very interesting and useful team. Its effect is that it replaces the value in the al register with another byte from the memory table located at the address specified by the remap_table_address operand.

The word "table" is very conditional, in fact, it's just a string of bytes. The address of the byte in the string that will replace the contents of the al register is determined by the sum (bx) + (al), i.e. the contents of al acts as an index in the byte array.

When working with the xlat command, pay attention to the following subtle point. Even though the command specifies the address of the byte string from which the new value is to be retrieved, this address must be preloaded (for example, using the lea command) into the bx register. Thus, the lookup_table_address operand is not really needed (the operand's optionality is shown by enclosing it in square brackets). As for the byte string (transcoding table), it is a memory area from 1 to 255 bytes in size (the range of an unsigned number in an 8-bit register).

Stack Commands

This group is a set of specialized commands focused on organizing flexible and efficient work with the stack.

The stack is an area of ​​memory specially allocated for temporary storage of program data. The importance of the stack is determined by the fact that a separate segment is provided for it in the program structure. In case the programmer forgot to declare a stack segment in his program, the tlink linker will issue a warning message.

The stack has three registers:

1) ss - stack segment register;

2) sp/esp - stack pointer register;

3) bp/ebp - stack frame base pointer register.

The stack size depends on the operating mode of the microprocessor and is limited to 64 KB (or 4 GB in protected mode).

Only one stack is available at a time, the segment address of which is contained in the SS register. This stack is called the current stack. In order to refer to another stack ("switch the stack"), it is necessary to load another address into the ss register. The SS register is automatically used by the processor to execute all instructions that work on the stack.

We list some more features of working with the stack:

1) writing and reading data on the stack is carried out in accordance with the LIFO principle,

2) as data is written to the stack, the latter grows towards lower addresses. This feature is embedded in the algorithm of commands for working with the stack;

3) when using the esp/sp and ebp/bp registers for memory addressing, the assembler automatically considers that the values ​​contained in it are offsets relative to the ss segment register.

In general, the stack is organized as shown in Figure 23.

Rice. 23. Conceptual diagram of stack organization

The SS, ESP/SP and EUR/BP registers are designed to work with the stack. These registers are used in a complex way, and each of them has its own functional purpose.

The ESP/SP register always points to the top of the stack, that is, it contains the offset at which the last element was pushed onto the stack. The stack instructions implicitly change this register so that it always points to the last element pushed onto the stack. If the stack is empty, then the value of esp is equal to the address of the last byte of the segment allocated for the stack. When an element is pushed onto the stack, the processor decrements the value of the esp register, and then writes the element to the address of the new vertex. When popping data from the stack, the processor copies the element located at the vertex address, and then increments the value of the stack pointer register esp. Thus, it turns out that the stack grows down, in the direction of decreasing addresses.

What if we need to access elements not at the top, but inside the stack? To do this, use the EBP register. The EBP register is the stack frame base pointer register.

For example, a typical trick when entering a subroutine is to pass the desired parameters by pushing them onto the stack. If the subroutine is also actively working with the stack, then access to these parameters becomes problematic. The way out is to save the address of the top of the stack in the frame (base) pointer of the stack after writing the necessary data to the stack - the EUR register. The value in EUR can later be used to access the passed parameters.

The beginning of the stack is located at higher memory addresses. In Figure 23, this address is denoted by the pair ss: fffF. The shift of wT is given here conditionally. In reality, this value is determined by the value that the programmer specifies when describing the stack segment in his program.

To organize work with the stack, there are special commands for writing and reading.

1. push source - writing the value of the source to the top of the stack.

Of interest is the algorithm of this command, which includes the following actions (Fig. 24):

1) (sp) = (sp) - 2; the value of sp is reduced by 2;

2) the value from the source is written to the address specified by the ss: sp pair.

Rice. 24. How the push command works

2. pop assignment - writing the value from the top of the stack to the location specified by the destination operand. The value is "popped" from the top of the stack. The algorithm of the pop command is the reverse of the algorithm of the push command (Fig. 25):

1) writing the contents of the top of the stack at the location indicated by the destination operand;

2) (sp) = (sp) + 2; increasing the value of sp.

Rice. 25. How the pop command works

3. pusha - a group write command to the stack. By this command, the registers ax, cx, dx, bx, sp, bp, si, di are sequentially written to the stack. Note that the original contents of sp are written, that is, the content that was before the pusha command was issued (Fig. 26).

Rice. 26. How the pusha command works

4. pushaw is almost synonymous with the pusha command. What's the difference? The bitness attribute can be either use16 or use32. Let's look at how the pusha and pushaw commands work with each of these attributes:

1) use16 - the pushaw algorithm is similar to the pusha algorithm;

2) use32 - pushaw does not change (i.e. it is insensitive to segment width and always works with word-sized registers - ax, cx, dx, bx, sp, bp, si, di). The pusha command is sensitive to the set segment width and when a 32-bit segment is specified, it works with the corresponding 32-bit registers, i.e. eax, esx, edx, ebx, esp, ebp, esi, edi.

5. pushad - performed similarly to the pusha command, but there are some peculiarities.

The following three commands perform the reverse of the above commands:

1) rora;

2) popaw;

3) pop.

The group of instructions described below allows you to save the flag register on the stack and write a word or double word to the stack. Note that the instructions listed below are the only ones in the microprocessor instruction set that allow (and require) access to the entire contents of the flag register.

1. pushf - saves the register of flags on the stack.

The operation of this command depends on the segment size attribute:

1) use 16 - the flags register of 2 bytes in size is written to the stack;

2) use32 - the eflags register of 4 bytes is written to the stack.

2. pushfw - saving a word-sized register of flags on the stack. Always works like pushf with the use16 attribute.

3. pushfd - saving the flags or eflags flags register on the stack depending on the bit width attribute of the segment (ie, the same as pushf).

Similarly, the following three commands perform the reverse of the operations discussed above:

1) popf;

2) popftv;

3) popfd.

And in conclusion, we note the main types of operations when the use of the stack is almost inevitable:

1) calling subroutines;

2) temporary storage of register values;

3) definition of local variables.

2. Arithmetic commands

The microprocessor can perform integer and floating point operations. To do this, its architecture has two separate blocks:

1) a device for performing integer operations;

2) a device for performing floating point operations.

Each of these devices has its own command system. In principle, an integer device can take over many of the functions of a floating point device, but this will be computationally expensive. For most problems using assembly language, integer arithmetic is sufficient.

Overview of a group of arithmetic instructions and data

An integer computing device supports a little more than a dozen arithmetic instructions. Figure 27 shows the classification of commands in this group.

Rice. 27. Classification of arithmetic commands

The group of integer arithmetic instructions works with two types of numbers:

1) integer binary numbers. Numbers may or may not have a signed digit, i.e., be signed or unsigned numbers;

2) integer decimal numbers.

Consider the machine formats in which these data types are stored.

Integer binary numbers

A fixed-point binary integer is a number encoded in the binary number system.

The dimension of a binary integer can be 8, 16 or 32 bits. The sign of a binary number is determined by how the most significant bit in the representation of the number is interpreted. This is 7,15 or 31st bits for numbers of the corresponding dimension. At the same time, it is interesting that among the arithmetic commands there are only two commands that really take into account this most significant bit as a sign one - these are the integer multiplication and division commands imul and idiv. In other cases, the responsibility for actions with signed numbers and, accordingly, with a sign bit lies with the programmer. The range of values ​​of a binary number depends on its size and interpretation of the most significant bit either as the most significant bit of the number or as the sign bit of the number (Table 9).

Table 9. Range of binary numbers Decimal numbers

Decimal numbers are a special type of representation of numerical information, which is based on the principle of encoding each decimal digit of a number by a group of four bits. In this case, each byte of the number contains one or two decimal digits in the so-called binary-coded decimal code (BCD - Binary-Coded Decimal). The microprocessor stores BCD numbers in two formats (Fig. 28):

1) packed format. In this format, each byte contains two decimal digits. A decimal digit is a 0-bit binary value between 9 and 4. In this case, the code of the highest digit of the number occupies the highest 4 bits. Therefore, the range of representation of a decimal packed number in 1 byte is from 00 to 99;

2) unpackaged format. In this format, each byte contains one decimal digit in the four least significant bits. The upper 4 bits are set to zero. This is the so-called zone. Therefore, the range of representing a decimal unpacked number in 1 byte is from 0 to 9.

Rice. 28. Representation of BCD numbers

How to describe binary decimal numbers in a program? To do this, you can use only two data description and initialization directives - db and dt. The possibility of using only these directives to describe BCD numbers is due to the fact that the principle of "low byte at low address" is also applicable to such numbers, which is very convenient for their processing. And in general, when using such a data type as BCD numbers, the order in which these numbers are described in the program and the algorithm for processing them is a matter of taste and personal preferences of the programmer. This will become clear after we look at the basics of working with BCD numbers below.

Arithmetic operations on binary integers

Adding unsigned binary numbers

The microprocessor performs the addition of operands according to the rules for adding binary numbers. There are no problems as long as the value of the result does not exceed the dimensions of the operand field. For example, when adding byte-sized operands, the result must not exceed the number 255. If this happens, then the result is incorrect. Let's consider why this happens.

For example, let's do the addition: 254 + 5 = 259 in binary. 11111110 + 0000101 = 1 00000011. The result went beyond 8 bits and its correct value fits into 9 bits, and the value 8 remained in the 3-bit field of the operand, which, of course, is not true. In the microprocessor, this outcome of the addition is predicted and special means are provided for fixing such situations and processing them. So, to fix the situation of going beyond the bit grid of the result, as in this case, the carry flag cf is intended. It is located in bit 0 of the EFLAGS/FLAGS flag register. It is the setting of this flag that fixes the fact of the transfer of one from the high order of the operand. Naturally, the programmer must take into account the possibility of such an outcome of the addition operation and provide means for correction. This involves including sections of code after the addition operation in which the cf flag is parsed. This flag can be parsed in various ways.

The easiest and most accessible is to use the jcc conditional branch command. This instruction has as its operand the name of the label in the current code segment. The transition to this label is carried out if, as a result of the operation of the previous command, the cf flag is set to 1. There are three binary addition commands in the microprocessor command system:

1) inc operand - increment operation, i.e. increase the value of the operand by 1;

2) add operand_1, operand_2 - addition instruction with the principle of operation: operand_1 = operand_1 + operand_2;

3) adc operand_1, operand_2 - addition instruction taking into account the carry flag cf. Command operation principle: operand_1 = operand_1 + operand_2 + value_sG.

Pay attention to the last command - this is the addition command, which takes into account the transfer of one from the high order. We have already considered the mechanism for the appearance of such a unit. Thus, the adc instruction is a microprocessor tool for adding long binary numbers, the dimensions of which exceed the lengths of standard fields supported by the microprocessor.

Signed Binary Addition

In fact, the microprocessor is "unaware" of the difference between signed and unsigned numbers. Instead, he has the means of fixing the occurrence of characteristic situations that develop in the process of calculations. We covered some of them when discussing unsigned addition:

1) the cf carry flag, setting it to 1 indicates that the operands were out of range;

2) the adc command, which takes into account the possibility of such an exit (carry from the least significant bit).

Another means is to register the state of the most significant (sign) bit of the operand, which is done using the overflow flag of in the EFLAGS register (bit 11).

Of course, you remember how numbers are represented in a computer: positive - in binary, negative - in two's complement. Consider various options for adding numbers. The examples are intended to show the behavior of the two most significant bits of the operands and the correctness of the result of the addition operation.

Example

= 30566 0111011101100110

+

00687 = 00000010

=

31253 = 01111010

We monitor the transfers from the 14th and 15th digits and the correctness of the result: there are no transfers, the result is correct.

Example

= 30566 0111011101100110

+

= 30566 0111011101100110

=

1132 = 11101110

There was a transfer from the 14th category; there is no transfer from the 15th category. The result is incorrect, because there is an overflow - the value of the number turned out to be greater than what a 16-bit signed number (+32 767) can have.

Example

-30566 = 10001000 10011010

+

-04875 = 11101100 11110101

=

-35441 = 01110101 10001111

There was a transfer from the 15th digit, there is no transfer from the 14th digit. The result is incorrect, because instead of a negative number, it turned out to be positive (the most significant bit is 0).

Example

-4875 = 11101100 11110101

+

-4875 = 11101100 11110101

=

09750 = 11011001

There are transfers from the 14th and 15th bits. The result is correct.

Thus, we examined all cases and found out that the overflow situation (setting the OF flag to 1) occurs during the transfer:

1) from the 14th digit (for signed positive numbers);

2) from the 15th digit (for negative numbers).

Conversely, no overflow occurs (ie, the OF flag is reset to 0) if there is a carry from both bits, or if there is no carry in both bits.

So overflow is registered with the overflow flag of. In addition to the of flag, when transferring from the high order, the transfer flag CF is set to 1. Since the microprocessor does not know about the existence of signed and unsigned numbers, the programmer is solely responsible for the correct actions with the resulting numbers. You can parse the CF and OF flags with the conditional jump instructions JC\JNC and JO\JNO, respectively.

As for the commands for adding numbers with a sign, they are the same as for numbers without a sign.

Subtraction of unsigned binary numbers

As in the analysis of the operation of addition, we will discuss the essence of the processes that occur when performing the operation of subtraction. If the minuend is greater than the subtrahend, then there is no problem - the difference is positive, the result is correct. If the minuend is less than the subtracted, there is a problem: the result is less than 0, and this is already a signed number. In this case, the result must be wrapped. What does this mean? With the usual subtraction (in a column), they make a loan of 1 from the highest order. The microprocessor does the same, i.e., it takes 1 from the digit following the highest one in the bit grid of the operand. Let's explain with an example.

Example

05 = 00000000

-10 = 00000000 00001010

To do the subtraction, let's do

senior imaginary loan:

100000000 00000101

-

00000000 00001010

=

11111111 11111011

Thus, in essence, the action

(65 + 536) - 5 = 10

0 here is, as it were, equivalent to the number 65536. The result, of course, is incorrect, but the microprocessor considers that everything is fine, although it fixes the fact of borrowing a unit by setting the carry flag cf. But look again carefully at the result of the subtraction operation. It's -5 in two's complement! Let's conduct an experiment: represent the difference as a sum of 5 + (-10).

Example

5 = 00000000

+

(-10)= 11111111 11110110

=

11111111 11111011

i.e. we got the same result as in the previous example.

Thus, after the command to subtract unsigned numbers, it is necessary to analyze the state of the CE flag. If it is set to 1, then this indicates that a borrow has occurred from the high order and the result is in the additional code.

Like the addition instructions, the group of subtraction instructions consists of the smallest possible set. These commands perform subtraction according to the algorithms that we are now considering, and the exceptions must be taken into account by the programmer himself. Subtraction commands include:

1) dec operand - decrement operation, i.e. decrease the value of the operand by 1;

2) sub operand_1, operand_2 - subtraction command; its operating principle: operand_1 = operand_1 - operand_2;

3) sbb operand_1, operand_2 - subtraction command taking into account the loan (ci flag): operand_1 = operand_1 - operand_2 - value_sG.

As you can see, among the subtraction commands there is an sbb command that takes into account the carry flag cf. This command is similar to adc, but now the cf flag acts as an indicator of borrowing 1 from the most significant digit when subtracting numbers.

Signed binary subtraction

Here everything is somewhat more complicated. The microprocessor does not need to have two devices - addition and subtraction. It is enough to have only one - the addition device. But for subtraction by the way of adding numbers with a sign in an additional code, it is necessary to represent both operands - both the reduced and the subtracted. The result should also be treated as a two's complement value. But here difficulties arise. First of all, they are related to the fact that the most significant bit of the operand is considered as a sign bit. Consider the example of subtracting 45 - (-127).

Example

Subtraction of signed numbers 1

45 = 0010

-

-127 = 1000 0001

=

-44 = 1010 1100

Judging by the sign bit, the result turned out to be negative, which, in turn, suggests that the number should be considered as a complement equal to -44. The correct result should be 172. Here, as in the case of signed addition, we met with a mantissa overflow, when the significant bit of the number changed the sign bit of the operand. You can track this situation by the contents of the overflow flag of. Setting it to 1 indicates that the result is out of range of signed numbers (i.e., the most significant bit has changed) for an operand of this size, and the programmer must take action to correct the result.

Example

Subtraction of signed numbers 2

-45-45 = -45 + (-45) = -90.

-45 = 11010011

+

-45 = 11010011

=

-90 = 1010 0110

Everything is fine here, the overflow flag of is reset to 0, and 1 in the sign bit indicates that the result value is a two's complement number.

Subtraction and addition of large operands

If you notice, the addition and subtraction instructions work with operands of a fixed dimension: 8, 16, 32 bits. But what if you need to add numbers of a larger dimension, for example 48 bits, using 16-bit operands? For example, let's add two 48-bit numbers:

Rice. 29. Adding large operands

Figure 29 shows the technology for adding long numbers step by step. It can be seen that the process of adding multi-byte numbers occurs in the same way as when adding two numbers "in a column" - with the implementation, if necessary, of transferring 1 to the highest bit. If we manage to program this process, then we will significantly expand the range of binary numbers on which we can perform addition and subtraction operations.

The principle of subtracting numbers with a representation range exceeding the standard operand bit grids is the same as for addition, i.e., the cf carry flag is used. You just need to imagine the process of subtracting in a column and correctly combine the microprocessor instructions with the sbb instruction.

To conclude our discussion of the addition and subtraction instructions, in addition to the cf and of flags, there are several other flags in the eflags register that can be used with binary arithmetic instructions. These are the following flags:

1) zf - zero flag, which is set to 1 if the result of the operation is 0, and to 1 if the result is not equal to 0;

2) sf - sign flag, the value of which after arithmetic operations (and not only) coincides with the value of the most significant bit of the result, i.e. with bit 7, 15 or 31. Thus, this flag can be used for operations on signed numbers.

Multiplication of unsigned numbers

The command for multiplying unsigned numbers is

mul factor_1

As you can see, only one multiplier operand is specified in the command. The second operand factor_2 is implicitly specified. Its location is fixed and depends on the size of the factors. Since, in general, the result of a multiplication is greater than any of its factors, its size and location must also be uniquely determined. Options for the sizes of the factors and the placement of the second operand and the result are shown in Table 10.

Table 10. Arrangement of operands and result in multiplication

It can be seen from the table that the product consists of two parts and, depending on the size of the operands, is placed in two places - in place factor_2 (lower part) and in the additional register ah, dx, edx (higher part). How, then, dynamically (i.e., during program execution) to know that the result is small enough to fit in one register, or that it exceeded the dimension of the register and the highest part ended up in another register? To do this, we use the cf and overflow flags already known to us from the previous discussion:

1) if the leading part of the result is zero, then after the product operation the flags cf = 0 and of = 0;

2) if these flags are non-zero, then this means that the result has gone beyond the smallest part of the product and consists of two parts, which should be taken into account in further work.

Multiply signed numbers

The command for multiplying numbers with a sign is

[imul operand_1, operand_2, operand_3]

This command is executed in the same way as the mul command. A distinctive feature of the imul command is only the formation of the sign.

If the result is small and fits in one register (i.e. if cf = of = 0), then the contents of the other register (the high part) is sign extension - all its bits are equal to the high bit (sign bit) of the low part of the result. Otherwise (if cf = of = 1), the sign of the result is the sign bit of the high part of the result, and the sign bit of the low part is the significant bit of the binary result code.

Division of unsigned numbers

The command for dividing unsigned numbers is

div divider

The divisor can be in memory or in a register and be 8, 16, or 32 bits in size. The location of the dividend is fixed and, like in the multiplication instruction, depends on the size of the operands. The result of the division command is the quotient and remainder values.

Options for the location and size of the operands of the division operation are shown in Table 11.

Table 11. Arrangement of operands and result in division

After the divide instruction is executed, the contents of the flags are undefined, but interrupt number 0, called "divide by zero", may occur. This type of interruption belongs to the so-called exceptions. This kind of interrupt occurs inside the microprocessor due to some anomalies during the computing process. Interrupt O, "divide by zero", while executing the div command can occur for one of the following reasons:

1) the divisor is zero;

2) the quotient is not included in the bit grid allocated for it, which can happen in the following cases:

a) when dividing a dividend with a value of a word by a divisor with a value of bytes, and the value of the dividend is more than 256 times greater than the value of the divisor;

b) when dividing a dividend with a value of a double word by a divisor with a value of a word, and the value of the dividend is more than 65 times greater than the value of the divisor;

c) when dividing the dividend with a quadruple word value by a divisor with a double word value, and the value of the dividend is more than 4 times greater than the value of the divisor.

Division with a sign

The command for dividing numbers with a sign is

idiv divider

For this command, all the considered provisions regarding commands and signed numbers are valid. We only note the features of the occurrence of the exception 0, "division by zero", in the case of numbers with a sign. It occurs when executing the idiv command for one of the following reasons:

1) the divisor is zero;

2) the quotient is not included in the bit grid allocated for it.

The latter, in turn, can happen:

1) when dividing a dividend with a signed word value by a divisor with a signed byte value, and the value of the dividend is more than 128 times the value of the divisor (thus, the quotient should not be outside the range from -128 to + 127);

2) when dividing the dividend by a signed double word value by the divisor by a signed word value, and the value of the dividend is more than 32 times the value of the divisor (thus, the quotient must not be outside the range from -768 to +32) ;

3) when dividing the dividend by a signed quadword value by a signed double word divisor, and the value of the dividend is more than 2 times the value of the divisor (thus, the quotient must not be outside the range of -147 to +483 648 2 147).

Auxiliary Instructions for Integer Operations

There are several instructions in the microprocessor instruction set that can make it easier to program algorithms that perform arithmetic calculations. Various problems may arise in them, for the resolution of which the microprocessor developers have provided several commands.

Type conversion commands

What if the sizes of the operands involved in arithmetic operations are different? For example, suppose in an addition operation, one operand is a word and the other is a double word. It was said above that operands of the same format must participate in the addition operation. If the numbers are unsigned, then the output is easy to find. In this case, on the basis of the original operand, a new one (double word format) can be formed, the high bits of which can simply be filled with zeros. The situation is more complicated for signed numbers: how to take into account the sign of the operand dynamically, during program execution? To solve such problems, the microprocessor instruction set has so-called type conversion instructions. These instructions expand bytes into words, words into double words, and double words into quad words (64-bit values). Type conversion instructions are especially useful when converting signed integers, as they automatically fill in the high-order bits of the newly constructed operand with the values ​​of the old object's sign bit. This operation results in integer values ​​of the same sign and the same magnitude as the original, but in a longer format. Such a transformation is called a sign propagation operation.

There are two kinds of type conversion commands.

1. Instructions without operands. These commands work with fixed registers:

1) cbw (Convert Byte to Word) - a command to convert a byte (in the al register) into a word (in the ah register) by spreading the value of the high bit al to all bits of the ah register;

2) cwd (Convert Word to Double) - a command to convert a word (in register ax) into a double word (in registers dx: ax) by spreading the value of the high bit ax to all bits of register dx;

3) cwde (Convert Word to Double) - a command to convert a word (in the register ax) into a double word (in the register eax) by spreading the value of the most significant bit of ax to all bits of the upper half of the eax register;

4) cdq (Convert Double Word to Quarter Word) - a command to convert a double word (in the eax register) into a quadruple word (in the edx: eax registers) by spreading the value of the most significant bit of eax to all bits of the edx register.

2. Commands movsx and movzx related to string processing commands. These commands have a useful property in the context of our problem:

1) movsx operand_1, operand_2 - send with sign propagation. Extends the 8 or 16-bit value of operand_2, which can be a register or a memory operand, to a 16 or 32-bit value in one of the registers, using the value of the sign bit to fill the higher positions of operand_1. This instruction is useful for preparing signed operands for arithmetic operations;

2) movzx operand_1, operand_2 - send with zero extension. Extends the 8-bit or 16-bit value of operand_2 to 16-bit or 32-bit, clearing (filling) the high positions of operand_2 with zeros. This instruction is useful for preparing unsigned operands for arithmetic.

Other useful commands

1. xadd destination, source - exchange and addition.

The command allows you to perform two actions in sequence:

1) exchange destination and source values;

2) place the destination operand in place of the sum: destination = destination + source.

2. neg operand - negation with two's complement.

The instruction inverts the value of the operand. Physically, the command performs one action:

operand = 0 - operand, i.e. subtracts the operand from zero.

The neg operand command can be used:

1) to change the sign;

2) to perform subtraction from a constant.

Arithmetic operations on binary-decimal numbers

In this section, we will look at the specifics of each of the four basic arithmetic operations for packed and unpacked BCD numbers.

The question may rightly arise: why do we need BCD numbers? The answer might be: BCD numbers are needed in business applications, i.e. where the numbers need to be large and precise. As we have already seen on the example of binary numbers, operations with such numbers are quite problematic for assembly language. The disadvantages of using binary numbers include the following:

1) Values ​​in word and double word format have a limited range. If the program is designed to work in the field of finance, then limiting the amount in rubles to 65 (for a word) or even 536 (for a double word) will significantly narrow the scope of its application;

2) the presence of rounding errors. Can you imagine a program running somewhere in a bank that does not take into account the value of the balance when operating with binary integers and operates with billions? I would not like to be the author of such a program. The use of floating point numbers will not save - the same rounding problem exists there;

3) presentation of a large amount of results in symbolic form (ASCII code). Business programs don't just do calculations; one of the purposes of their use is the prompt delivery of information to the user. To do this, of course, the information must be presented in symbolic form. Converting numbers from binary to ASCII requires some computational effort. A floating point number is even more difficult to translate into a symbolic form. But if you look at the hexadecimal representation of an unpacked decimal digit and its corresponding character in the ASCII table, you can see that they differ by 30h. Thus, the conversion to symbolic form and vice versa is much easier and faster.

You have probably already seen the importance of mastering at least the basics of actions with decimal numbers. Next, consider the features of performing basic arithmetic operations with decimal numbers. We immediately note the fact that there are no separate commands for addition, subtraction, multiplication and division of BCD numbers. This was done for quite understandable reasons: the dimension of such numbers can be arbitrarily large. BCD numbers can be added and subtracted, both packed and unpacked, but only unpacked BCD numbers can divide and multiply. Why this is so will be seen from further discussion.

Arithmetic on unpacked BCD numbers

Add unpacked BCD numbers

Let's consider two cases of addition.

Example

The result of addition is not more than 9

6 = 0000

+

3 = 0000

=

9 = 0000

There is no transfer from the junior to the senior tetrad. The result is correct.

Example

The result of addition is greater than 9:

06 = 0000

+

07 = 0000

=

13 = 0000

We have received no longer a BCD number. The result is wrong. The correct result in unpacked BCD format should be 0000 0001 0000 0011 in binary (or 13 in decimal).

After analyzing this problem when adding BCD numbers (and similar problems when performing other arithmetic operations) and possible ways to solve it, the developers of the microprocessor command system decided not to introduce special commands for working with BCD numbers, but to introduce several corrective commands.

The purpose of these instructions is to correct the result of the operation of ordinary arithmetic instructions for cases where the operands in them are BCD numbers.

In the case of subtraction in example 10, it can be seen that the result obtained needs to be corrected. To correct the operation of adding two single-digit unpacked BCD numbers in the microprocessor command system, there is a special command - aaa (ASCII Adjust for Addition) - correction of the result of addition for representation in symbolic form.

This instruction has no operands. It works implicitly only with the al register and parses the value of its lower tetrad:

1) if this value is less than 9, then the cf flag is reset to XNUMX and the transition to the next instruction is carried out;

2) if this value is greater than 9, then the following actions are performed:

a) 6 is added to the contents of the lower tetrad al (but not to the contents of the entire register!) Thus, the value of the decimal result is corrected in the correct direction;

b) the flag cf is set to 1, thereby fixing the transfer to the most significant bit so that it can be taken into account in subsequent actions.

So, in example 10, assuming that the sum value 0000 1101 is in al, after the aaa instruction, the register will have 1101 + 0110 = 0011, i.e. binary 0000 0011 or decimal 3, and the cf flag will be set to 1, i.e. The transfer has been stored in the microprocessor. Next, the programmer will need to use the adc addition instruction, which will take into account the carry from the previous bit.

Subtraction of unpacked BCD numbers

The situation here is quite similar to addition. Let's consider the same cases.

Example

The result of subtraction is not greater than 9:

6 = 0000

-

3 = 0000

=

3 = 0000

As you can see, there is no loan from the senior notebook. The result is correct and does not require correction.

Example

The result of subtraction is greater than 9:

6 = 0000

-

7 = 0000

=

-1 = 1111 1111

Subtraction is carried out according to the rules of binary arithmetic. Therefore, the result is not a BCD number.

The correct result in unpacked BCD format should be 9 (0000 1001 in binary). In this case, a borrow from the most significant digit is assumed, as with a normal subtraction command, i.e. in the case of BCD numbers, subtraction 16 - 7 should actually be performed. Thus, it is clear that, as in the case of addition, the subtraction result must be corrected. For this, there is a special command - aas (ASCII Adjust for Substraction) - correction of the result of subtraction for representation in symbolic form.

The aas instruction also has no operands and operates on the al register, parsing its least-order tetrad as follows:

1) if its value is less than 9, then the cf flag is reset to 0 and control is transferred to the next command;

2) if the value of the tetrad in al is greater than 9, then the aas command performs the following actions:

a) subtracts 6 from the contents of the lower tetrad of register al (note - not from the contents of the entire register);

b) resets the upper tetrad of register al;

c) sets the cf flag to 1, thereby fixing the imaginary high-order borrow.

It is clear that the aas command is used in conjunction with the basic sub and sbb subtraction commands. In this case, it makes sense to use the sub command only once, when subtracting the lowest digits of the operands, then the sbb command should be used, which will take into account a possible loan from the highest order.

Multiplication of unpacked BCD numbers

Using the example of adding and subtracting unpacked numbers, it became clear that there are no standard algorithms for performing these operations on BCD numbers, and the programmer must himself, based on the requirements for his program, implement these operations.

The implementation of the two remaining operations - multiplication and division - is even more complicated. In the microprocessor instruction set, there are only means for the production of multiplication and division of single-digit unpacked BCD numbers.

In order to multiply numbers of arbitrary dimension, you need to implement the multiplication process yourself, based on some multiplication algorithm, for example, "in a column".

In order to multiply two one-digit BCD numbers, you must:

1) place one of the factors in the AL register (as required by the mul instruction);

2) place the second operand in a register or memory, allocating a byte;

3) multiply the factors with the mul command (the result, as expected, will be in ah);

4) the result, of course, will be in binary code, so it needs to be corrected.

To correct the result after multiplication, a special command is used - aam (ASCII Adjust for Multiplication) - correction of the result of multiplication for representation in symbolic form.

It has no operands and operates on the AX register as follows:

1) divides al by 10;

2) the result of division is written as follows: quotient in al, remainder in ah. As a result, after executing the aam instruction, the AL and ah registers contain the correct BCD digits of the product of two digits.

Before we end our discussion of the aam command, we need to note one more use for it. This command can be used to convert a binary number in the AL register into an unpacked BCD number, which will be placed in the ah register: the most significant digit of the result is in ah, the least significant digit is in al. It is clear that the binary number must be in the range 0... 99.

Division of unpacked BCD numbers

The process of dividing two unpacked BCD numbers is somewhat different from the other previously considered operations with them. Correction actions are also required here, but they must be carried out before the main operation that directly divides one BCD number by another BCD number. First, in the register ah, you need to get two unpacked BCD digits of the dividend. This makes the programmer comfortable for him in a way. Next, you need to issue the command aad - aad (ASCII Adjust for Division) - division correction for symbolic representation.

The instruction has no operands and converts the two-digit unpacked BCD number in the ax register to a binary number. This binary number will subsequently play the role of the dividend in the division operation. In addition to the conversion, the aad command places the resulting binary number in the AL register. The dividend will naturally be a binary number from the range 0... 99.

The algorithm by which the aad command performs this conversion is as follows:

1) multiply the highest digit of the original BCD number in ah (the content of AH) by 10;

2) perform the addition AH + AL, the result of which (binary number) is entered in AL;

3) reset the contents of AH.

Next, the programmer needs to issue the usual div division instruction to perform division of the contents of ax by a single BCD digit located in a byte register or byte memory location.

Similar to aash, the aad command can also be used to convert unpacked BCD numbers from the range 0... 99 to their binary equivalent.

To divide numbers of greater capacity, as well as in the case of multiplication, you need to implement your own algorithm, for example, "in a column", or find a more optimal way.

Arithmetic on Packed BCD Numbers

As noted above, packed BCD numbers can only be added and subtracted. To perform other actions on them, they must be additionally converted either to an unpacked format or to a binary representation. Due to the fact that packed BCD numbers are not of great interest, we will consider them briefly.

Adding Packed BCD Numbers

First, let's get to the heart of the problem and try to add two two-digit packed BCD numbers. Example Adding Packed BCD Numbers:

= 67 01100111

+

= 75 01110101

=

142 = 1101 1100 = 220

As you can see, in binary the result is 1101 1100 (or 220 in decimal), which is incorrect. This is because the microprocessor is unaware of the existence of BCD numbers and adds them according to the rules for adding binary numbers. Actually, the result in BCD should be 0001 0100 0010 (or 142 in decimal).

It can be seen that, as for unpacked BCD numbers, for packed BCD numbers there is a need to somehow correct the results of arithmetic operations.

The microprocessor provides for this command daa - daa (Decimal Adjust for Addition) - correction of the result of addition for presentation in decimal form.

The daa command converts the contents of the al register into two packed decimal digits according to the algorithm given in the description of the daa command. The resulting unit (if the result of the addition is greater than 99) is stored in the cf flag, thereby taking into account the transfer to the most significant bit.

Subtraction of packed BCD numbers

Similar to addition, the microprocessor treats the packed BCD numbers as binary and subtracts the BCD numbers as binary accordingly.

Example

Subtraction of packed BCD numbers.

Let's subtract 67-75. Since the microprocessor performs subtraction in the way of addition, we will follow this:

= 67 01100111

+

-75 = 10110101

=

-8 = 0001 1100 = 28

As you can see, the result is 28 in decimal, which is absurd. In BCD, the result should be 0000 1000 (or 8 in decimal).

When programming the subtraction of packed BCD numbers, the programmer, as well as when subtracting unpacked BCD numbers, must control the sign himself. This is done using the CF flag, which fixes the high-order borrow.

The subtraction of BCD numbers itself is done by a simple sub or sbb subtraction command. Correction of the result is carried out by the command das - das (Decimal Adjust for Substraction) - correction of the result of subtraction for representation in decimal form.

The das command converts the contents of the AL register to two packed decimal digits according to the algorithm given in the description of the das command.

LECTURE No. 19. Control transfer commands

1. Logic commands

Along with the means of arithmetic calculations, the microprocessor command system also has means of logical data conversion. By logical means such data transformations, which are based on the rules of formal logic.

Formal logic operates at the level of true and false statements. For a microprocessor, this usually means 1 and 0, respectively. For a computer, the language of zeros and ones is native, but the minimum unit of data with which machine instructions work is a byte. However, at the system level, it is often necessary to be able to operate at the lowest possible level, the bit level.

Rice. 29. Means of logical data processing

The means of logical data transformation include logical commands and logical operations. The operand of an assembler instruction can generally be an expression, which, in turn, is a combination of operators and operands. Among these operators there may be operators that implement logical operations on expression objects.

Before a detailed consideration of these tools, let's consider what the logical data themselves are and what operations are performed on them.

Boolean data

The theoretical basis for logical data processing is formal logic. There are several systems of logic. One of the most famous is the propositional calculus. A proposition is any statement that can be said to be either true or false.

The propositional calculus is a set of rules used to determine the truth or falsity of some combination of propositions.

The propositional calculus is very harmoniously combined with the principles of the computer and the basic methods of its programming. All hardware components of a computer are built on logic chips. The system for representing information in a computer at the lowest level is based on the concept of a bit. A bit, having only two states (0 (false) and 1 (true)), naturally fits into the propositional calculus.

According to the theory, the following logical operations can be performed on statements (on bits).

1. Negation (logical NOT) - a logical operation on one operand, the result of which is the reciprocal of the value of the original operand.

This operation is uniquely characterized by the following truth table (Table 12).

Table 12. Truth table for logical negation

2. Logical addition (logical inclusive OR) - a logical operation on two operands, the result of which is "true" (1) if one or both operands are true (1), and "false" (0) if both operand is false (0).

This operation is described using the following truth table (Table 13).

Table 13. Truth table for logical inclusive OR

3. Logical multiplication (logical AND) - a logical operation on two operands, the result of which is true (1) only if both operands are true (1). In all other cases, the value of the operation is "false" (0).

This operation is described using the following truth table (Table 14).

Table 14. Logic AND truth table

4. Logical exclusive addition (logical exclusive OR) - a logical operation on two operands, the result of which is "true" (1), if only one of the two operands is true (1), and false (0), if both operand is either false (0) or true (1). This operation is described using the following truth table (Table 15).

Table 15. Truth table for logical XOR

The microprocessor instruction set contains five instructions that support these operations. These instructions perform logical operations on the bits of the operands. The dimensions of the operands, of course, must be the same. For example, if the dimension of the operands is equal to a word (16 bits), then the logical operation is performed first on zero bits of the operands, and its result is written to the place of bit 0 of the result. Next, the command sequentially repeats these actions on all bits from the first to the fifteenth.

Logic commands

The microprocessor command system has the following set of commands that support working with logical data:

1) and operand_1, operand_2 - logical multiplication operation. The command performs a bitwise logical AND operation (conjunction) on the bits of the operands operand_1 and operand_2. The result is written in place of operand_1;

2) og operand_1, operand_2 - logical addition operation. The command performs a bitwise logical OR operation (disjunction) on the bits of the operands operand_1 and operand_2. The result is written in place of operand_1;

3) xor operand_1, operand_2 - operation of logical exclusive addition. The command performs a bitwise logical XOR operation on the bits of the operands operand_1 and operand_2. The result is written in place of the operand;

4) test operand_1, operand_2 - "test" operation (using the logical multiplication method). The command performs a bitwise logical AND operation on the bits of the operands operand_1 and operand_2. The state of the operands remains the same, only the flags zf, sf, and pf are changed, which makes it possible to analyze the state of individual bits of the operand without changing their state;

5) not operand - operation of logical negation. The command performs a bitwise inversion (replacing the value with the opposite) of each bit of the operand. The result is written in place of the operand.

In order to understand the role of logical commands in the microprocessor command system, it is very important to understand the areas of their application and typical methods of their use in programming.

With the help of logical commands, it is possible to select individual bits in the operand for the purpose of setting them, resetting them, inverting them, or simply checking for a certain value.

To organize such work with bits, operand_2 usually plays the role of a mask. With the help of the bits of this mask set in bit 1, the operand_1 bits necessary for a particular operation are determined. Let's show what logical commands can be used for this purpose:

1) to set certain digits (bits) to 1, the command og operand_1, operand_2 is used.

In this instruction, operand_2, which acts as a mask, must contain 1 bits in place of those bits that must be set to 1 in operand_XNUMX;

2) to reset certain digits (bits) to 0, the command and operand_1, operand_2 is used.

In this instruction, operand_2, which acts as a mask, must contain zero bits in place of those bits that must be set to 0 in operand_1;

3) command xor operand_1, operand_2 is applied:

a) to find out which bits in operand_1 and operand differ;

b) to invert the state of the specified bits in operand_1.

The mask bits of interest to us (operand_2) when executing the xor command must be single, the rest must be zero;

The command test operand_1, operand_2 (check operand_1) is used to check the status of the specified bits.

The tested bits of operand_1 in the mask (operand_2) must be set to one. The algorithm of the test command is similar to the algorithm of the and command, but it does not change the value of operand_1. The result of the command is to set the value of the zero flag zf:

1) if zf = 0, then as a result of logical multiplication, a zero result was obtained, i.e., one unit bit of the mask, which did not match the corresponding unit bit of the operand;

2) if zf = 1, then as a result of logical multiplication, a non-zero result is obtained, i.e. at least one unit bit of the mask coincides with the corresponding unit bit of operand_1.

To react to the result of the test command, it is advisable to use the jump command jnz label (Jump if Not Zero) - jump if the zero flag zf is non-zero, or the command with the opposite action - jz label (Jump if Zero) - jump if the zero flag zf = 0.

The following two commands search for the first operand bit set to 1. The search can be performed both from the beginning and from the end of the operand:

1) bsf operand_1, operand_2 (Bit Scanning Forward) - scanning bits forward. The instruction searches (scans) the bits of operand_2 from the least significant to the most significant (from bit 0 to the most significant bit) in search of the first bit set to 1. If one is found, operand_1 is filled with the number of this bit as an integer value. If all bits of operand_2 are 0, then the zero flag zf is set to 1, otherwise the zf flag is reset to 0;

2) bsr operand_1, operand_2 (Bit Scanning Reset) - scan bits in reverse order. The instruction searches (scans) the bits of operand_2 from high to low (from most significant bit to bit 0) in search of the first bit set to 1. If one is found, operand_1 is filled with the number of this bit as an integer value. It is important that the position of the first unit bit on the left is still counted relative to bit 0. If all bits of operand_2 are 0, then the zero flag zf is set to 1, otherwise the flag zf is reset to 0.

In the latest models of Intel microprocessors, several more instructions have appeared in the group of logical instructions that allow you to access one specific bit of the operand. The operand can be either in memory or in a general register. The bit position is given by the bit offset relative to the least significant bit of the operand. The offset value can be specified either as a direct value or contained in a general purpose register. You can use the results of the bsr and bsf commands as the offset value. All instructions assign the value of the selected bit to the CE flag.

1) bt operand, bit_offset (Bit Test) - bit test. The instruction transfers the bit value to the cf flag;

2) bts operand, offset_bit (Bit Test and Set) - checking and setting a bit. The instruction transfers the bit value to the CF flag and then sets the bit to be checked to 1;

3) btr operand, bit_offset (Bit Test and Reset) - checking and resetting a bit. The instruction transfers the bit value to the CF flag and then sets this bit to 0;

4) btc operand, offset_bit (Bit Test and Convert) - checking and inverting a bit. The instruction wraps the value of a bit in the cf flag and then inverts the value of that bit.

Shift Commands

The instructions in this group also provide manipulation of individual bits of the operands, but in a different way than the logical instructions discussed above.

All shift instructions move bits in the operand field left or right depending on the opcode. All shift instructions have the same structure - copy operand, shift_count.

The number of shifted bits - shift_counter - is located at the place of the second operand and can be set in two ways:

1) statically, which involves setting a fixed value using a direct operand;

2) dynamically, which means entering the value of the shift counter into the cl register before executing the shift instruction.

Based on the dimension of the cl register, it is clear that the value of the shift counter can range from 0 to 255. But in fact, this is not entirely true. For optimization purposes, the microprocessor accepts only the value of the five least significant bits of the counter, i.e. the value lies in the range from 0 to 31.

All shift instructions set the carry flag cf.

As bits shift out of the operand, they first hit the carry flag, setting it equal to the value of the next bit outside the operand. Where this bit goes next depends on the type of shift instruction and the program algorithm.

Shift commands can be divided into two types according to the principle of operation:

1) linear shift commands;

2) cyclic shift commands.

Linear shift commands

Commands of this type include commands that shift according to the following algorithm:

1) the next bit that is pushed sets the CF flag;

2) the bit entered into the operand from the other end has the value 0;

3) when the next bit is shifted, it goes into the CF flag, while the value of the previous shifted bit is lost! Linear shift commands are divided into two subtypes:

1) logical linear shift commands;

2) arithmetic linear shift instructions.

The logical linear shift commands include the following:

1) shl operand, counter_shifts (Shift Logical Left) - logical shift to the left. The content of the operand is shifted to the left by the number of bits specified by shift_count. On the right (in the position of the least significant bit) zeros are entered;

2) shr operand, shift_count (Shift Logical Right) - logical shift to the right. The content of the operand is shifted to the right by the number of bits specified by shift_count. On the left (in the position of the most significant, sign bit), zeros are entered.

Figure 30 shows how these commands work.

Rice. 30. Scheme of work of commands of linear logical shift

Arithmetic linear shift instructions differ from logical shift instructions in that they operate on the sign bit of the operand in a special way.

1) sal operand, shift_counter (Shift Arithmetic Left) - arithmetic shift to the left. The content of the operand is shifted to the left by the number of bits specified by shift_count. On the right (in the position of the least significant bit), zeros are entered. The sal instruction does not preserve the sign, but sets the flag with / in case of a sign change by the next bit advanced. Otherwise, the sal command is exactly the same as the shl command;

2) sar operand, shift_count (Shift Arithmetic Right) - arithmetic right shift. The content of the operand is shifted to the right by the number of bits specified by shift_count. Zeros are inserted into the operand on the left. The sar command preserves the sign, restoring it after each bit shift.

Figure 31 shows how linear arithmetic shift instructions work.

Rice. 31. Scheme of operation of linear arithmetic shift commands

Rotate Commands

The cyclic shift instructions include instructions that store the values ​​of the shifted bits. There are two types of cyclic shift instructions:

1) simple cyclic shift commands;

2) cyclic shift commands via the carry flag cf.

Simple cyclic shift commands include:

1) rol operand, shift_counter (Rotate Left) - cyclic shift to the left. The content of the operand is shifted to the left by the number of bits specified by the shift_count operand. Left-shifted bits are written to the same operand from the right;

2) gog operand, counter_shifts (Rotate Right) - cyclic shift to the right. The contents of the operand are shifted to the right by the number of bits specified by the shift_count operand. Right-shifted bits are written to the same operand on the left.

Rice. 32. Scheme of operation of commands of a simple cyclic shift

As can be seen from Figure 32, the instructions of a simple cyclic shift in the course of their work perform one useful action, namely: the cyclically shifted bit is not only inserted into the operand from the other end, but at the same time its value becomes the value of the CE flag.

The cyclic shift commands through the carry flag CF differ from the simple cyclic shift commands in that the shifted bit does not immediately enter the operand from its other end, but is first written to the carry flag CE Only the next execution of this shift command (provided that it is executed in loop) causes the previously advanced bit to be placed at the other end of the operand (Fig. 33).

The following are related to the cyclic shift commands via the carry flag:

1) rcl operand, shift_count (Rotate through Carry Left) - cyclic left shift through carry.

The content of the operand is shifted to the left by the number of bits specified by the shift_count operand. The shifted bits in turn become the value of the carry flag cf.

2) rsg operand, shift_count (Rotate through Carry Right) - cyclic shift to the right through a carry.

The contents of the operand are shifted to the right by the number of bits specified by the shift_count operand. The shifted bits in turn become the value of the carry flag CF.

Rice. 33. Rotate Instructions via Carry Flag CF

Figure 33 shows that when shifting through the carry flag, an intermediate element appears, with the help of which, in particular, it is possible to replace cyclically shifted bits, in particular, the mismatch of bit sequences.

Hereinafter, mismatch of a bit sequence means an action that allows in some way to localize and extract the necessary sections of this sequence and write them to another place.

Additional shift commands

The command system of the latest Intel microprocessor models, starting with the i80386, contains additional shift commands that expand the capabilities we discussed earlier. These are the double precision shift commands:

1) shld operand_1, operand_2, shift_counter - double precision left shift. The shld command performs a replacement by shifting the bits of operand_1 to the left, filling its bits on the right with the values ​​of the bits displaced from operand_2 according to the diagram in Fig. 34. The number of bits to be shifted is determined by the shift_counter value, which can be in the range 0... 31. This value can be specified as an immediate operand or contained in the cl register. The value of operand_2 is not changed.

Rice. 34. The scheme of the shld command

2) shrd operand_1, operand_2, shift_counter - double precision right shift. The instruction performs the replacement by shifting the bits of the operand_1 operand to the right, filling its bits on the left with the values ​​of the bits displaced from operand_2 according to the diagram in Figure 35. The number of bits to be shifted is determined by the value of the shift_counter, which can lie in the range 0... 31. This value can be specified by the immediate operand or contained in the cl register. The value of operand_2 is not changed.

Rice. 35. The scheme of the shrd command

As we noted, the shld and shrd commands perform shifts up to 32 bits, but due to the peculiarities of specifying operands and the operation algorithm, these commands can be used to work with fields up to 64 bits long.

2. Control Transfer Commands

We got acquainted with some commands from which the linear sections of the program are formed. Each of them generally performs some action to convert or transfer data, after which the microprocessor transfers control to the next instruction. But very few programs work in such a consistent way. There are usually points in a program where a decision must be made about which instruction will be executed next. This solution could be:

1) unconditional - at this point, it is necessary to transfer control not to the command that comes next, but to another, which is at some distance from the current command;

2) conditional - the decision about which command will be executed next is made on the basis of an analysis of some conditions or data.

A program is a sequence of commands and data that occupy a certain amount of RAM space. This memory space can either be contiguous or consist of multiple fragments.

Which program instruction should be executed next, the microprocessor learns from the contents of the cs: (e) ip register pair:

1) cs - code segment register, which contains the physical (base) address of the current code segment;

2) eip/ip - instruction pointer register, which contains a value representing the offset in memory of the next instruction to be executed relative to the beginning of the current code segment.

Which particular register will be used depends on the set addressing mode use16 or use32. If use 16 is specified, then ip is used, if use32, then eip is used.

Thus, control transfer instructions change the contents of the cs and eip / ip registers, as a result of which the microprocessor selects for execution not the next program instruction in order, but an instruction in some other section of the program. The pipeline inside the microprocessor is reset.

According to the principle of operation, the microprocessor commands that provide the organization of transitions in the program can be divided into 3 groups:

1. Unconditional transfer of control commands:

1) an unconditional branch command;

2) a command to call a procedure and return from a procedure;

3) command to call software interrupts and return from software interrupts.

2. Commands for conditional transfer of control:

1) jump commands by the result of the comparison command p;

2) transition commands according to the state of a certain flag;

3) instructions for jumping through the contents of the esx/cx register.

3. Cycle control commands:

1) a command for organizing a cycle with a counter ехх/сх;

2) a command for organizing a cycle with a counter ех/сх with the possibility of early exit from the cycle by an additional condition.

Unconditional jumps

The previous discussion has revealed some details of the transition mechanism. Jump instructions modify the eip/ip instruction pointer register and possibly the cs code segment register. What exactly needs to be modified depends on:

1) on the type of operand in the unconditional branch instruction (near or far);

2) from specifying a modifier before the jump address (in the jump instruction); in this case, the jump address itself can be located either directly in the instruction (direct jump), or in a register or memory cell (indirect jump).

The modifier can take the following values:

1) near ptr - direct transition to a label inside the current code segment. Only the eip/ip register is modified (depending on the specified use16 or use32 code segment type) based on the address (label) specified in the command or an expression using the value extraction symbol - $;

2) far ptr - direct transition to a label in another code segment. The jump address is specified as an immediate operand or address (label) and consists of a 16-bit selector and a 16/32-bit offset, which are loaded into the cs and ip/eip registers, respectively;

3) word ptr - indirect transition to a label inside the current code segment. Only eip/ip is modified (by the offset value from memory at the address specified in the command, or from a register). Offset size 16 or 32 bits;

4) dword ptr - indirect transition to a label in another code segment. Both registers - cs and eip / ip - are modified (by a value from memory - and only from memory, from a register). The first word/dword of this address represents the offset and is loaded into ip/eip; the second/third word is loaded into cs. jmp unconditional jump instruction

The command syntax for an unconditional jump is jmp [modifier] jump_address - an unconditional jump without saving information about the return point.

Jump_address is the address in the form of a label or the address of the memory area in which the jump pointer is located.

In total, in the microprocessor instruction system there are several codes of machine instructions for the unconditional jump jmp.

Their differences are determined by the transition distance and the way the target address is specified. The jump distance is determined by the location of the jump_address operand. This address may be in the current code segment or in some other segment. In the first case, the transition is called intra-segment, or close, in the second - inter-segment, or distant. An intra-segment jump assumes that only the contents of the eip/ip register are changed.

There are three options for intra-segment use of the jmp command:

1) straight short;

2) straight;

3) indirect.

Процедуры

Assembly language has several tools that solve the problem of duplicating sections of code. These include:

1) mechanism of procedures;

2) macro assembler;

3) interrupt mechanism.

A procedure, often also called a subroutine, is the basic functional unit for decomposing (dividing into several parts) a task. A procedure is a group of commands for solving a specific subtask and has the means of receiving control from the point where the task of a higher level is called and returning control to this point.

In the simplest case, the program may consist of a single procedure. In other words, a procedure can be defined as a well-formed set of commands, which, being described once, can be called anywhere in the program if necessary.

To describe a sequence of commands as a procedure in assembly language, two directives are used: PROC and ENDP.

The procedure description syntax is as follows (Fig. 36).

Rice. 36. Syntax of the description of the procedure in the program

Figure 36 shows that in the procedure header (PROC directive) only the procedure name is mandatory. Among the large number of operands of the PROC directive, [distance] should be highlighted. This attribute can take the values ​​near or far and characterizes the possibility of calling the procedure from another code segment. By default, the [distance] attribute is set to near.

The procedure can be placed anywhere in the program, but in such a way that it does not randomly get control. If the procedure is simply inserted into the general instruction stream, then the microprocessor will perceive the instructions of the procedure as part of this stream and, accordingly, will execute the instructions of the procedure.

Conditional Jumps

The microprocessor has 18 conditional jump instructions. These commands allow you to check:

1) the relationship between operands with a sign ("greater - less");

2) the relation between operands without a sign ("higher - lower");

3) states of arithmetic flags ZF, SF, CF, OF, PF (but not AF).

Conditional jump commands have the same syntax:

jcc jump_label

As you can see, the mnemonic code of all commands begins with "j" - from the word jump (jump), it - determines the specific condition analyzed by the command.

As for the jump_label operand, this label can only be located within the current code segment, inter-segment transfer of control in conditional jumps is not allowed. In this regard, there is no question about the modifier, which was present in the syntax of the unconditional jump commands. In early models of the microprocessor (i8086, i80186 and i80286), conditional branch instructions could only make short jumps - from -128 to +127 bytes from the instruction following the conditional branch instruction. Starting with microprocessor model 80386, this restriction is removed, but, as you can see, only within the current code segment.

In order to make a decision about where control will be transferred to the conditional jump command, a condition must first be formed, on the basis of which the decision to transfer control will be made.

Sources of such a condition can be:

1) any command that changes the state of arithmetic flags;

2) the comparison instruction p, which compares the values ​​of two operands;

3) the state of the esx/cx register.

cmp comparison command

The page compare command has an interesting way of working. It is exactly the same as the subtraction command - sub operand, operand_2.

The p instruction, like the sub instruction, subtracts operands and sets flags. The only thing it doesn't do is write the result of the subtraction in place of the first operand.

The command syntax str - str operand_1, operand_2 (compare) - compares two operands and sets flags based on the results of the comparison.

The flags set by the p command can be analyzed by special conditional branch instructions. Before we look at them, let's pay a little attention to the mnemonics of these conditional jump instructions (Table 16). Understanding the notation when forming the name of the conditional jump commands (the element in the name of the jcc command, we designated it) will facilitate their memorization and further practical use.

Table 16. Meaning of abbreviations in the jcc command name Table 17. List of conditional jump commands for the command p operand_1, operand_2

Do not be surprised by the fact that several different mnemonic codes of conditional branch commands correspond to the same flag values ​​(they are separated from each other by a slash in Table 17). The difference in name is due to the desire of microprocessor developers to make it easier to use conditional jump instructions in combination with certain groups of instructions. Therefore, different names reflect rather a different functional orientation. However, the fact that these commands respond to the same flags makes them absolutely equivalent and equal in the program. Therefore, in Table 17 they are grouped not by name, but by the values ​​of the flags (conditions) to which they respond.

Conditional Branch Instructions and Flags

The mnemonic designation of some conditional jump instructions reflects the name of the flag with which they work, and has the following structure: the first character is "j" (Jump, jump), the second is either the flag designation or the negation character "n", followed by the name of the flag . This team structure reflects its purpose. If there is no character "n", then the state of the flag is checked, if it is equal to 1, a transition to the jump label is made. If the character "n" is present, then the flag state is checked for equality to 0, and if successful, a jump to the jump label is made.

Command mnemonics, flag names, and jump conditions are shown in Table 18. These commands can be used after any commands that modify the specified flags.

Table 18. Conditional Jump Instructions and Flags

If you look closely at tables 17 and 18, you can see that many of the conditional jump instructions in them are equivalent, since both of them are based on the analysis of the same flags.

Conditional Jump Instructions and the esx/cx Register

The architecture of the microprocessor involves the specific use of many registers. For example, the EAX / AX / AL register is used as an accumulator, and the BP, SP registers are used to work with the stack. The ECX / CX register also has a certain functional purpose: it acts as a counter in loop control commands and when working with character strings. It is possible that functionally the conditional branch instruction associated with the esx/cx register would be more correctly attributed to this group of instructions.

The syntax for this conditional branch instruction is:

1) jcxz jump_label (Jump if ex is Zero) - jump if cx is zero;

2) jecxz jump_label (Jump Equal ех Zero) - jump if ех is zero.

These commands are very useful when looping and when working with character strings.

It should be noted that there is a limitation inherent in the jcxz/jecxz command. Unlike other conditional transfer instructions, the jcxz/jecxz instruction can only address short jumps -128 bytes or +127 bytes from the instruction following it.

Cycle organization

The cycle, as you know, is an important algorithmic structure, without the use of which, probably, no program can do. You can organize the cyclic execution of a certain section of the program, for example, using the conditional transfer of control commands or the unconditional jump command jmp. With this organization of the cycle, all operations for its organization are performed manually. But, given the importance of such an algorithmic element as a cycle, the developers of the microprocessor introduced a group of three commands into the instruction system, which facilitates the programming of cycles. These instructions also use the esx/cx register as a loop counter.

Let's give a brief description of these commands:

1) loop transition_label (Loop) - repeat the loop. The command allows you to organize loops similar to for loops in high-level languages ​​with automatic decrement of the loop counter. The team's job is to do the following:

a) decrement of the ECX/CX register;

b) comparing the ECX/CX register with zero: if (ECX/CX) = 0, then control is transferred to the next command after loop;

2) loope/loopz jump_label

The loope and loopz commands are absolute synonyms. The work of commands is to perform the following actions:

a) decrement of the ECX/CX register;

b) comparing the ECX/CX register with zero;

c) analysis of the status of the zero flag ZF if (ECX/CX) = 0 or XF = 0, control is transferred to the next command after loop.

3) loopne/loopnz jump_label

The commands loopne and loopnz are also absolute synonyms. The work of commands is to perform the following actions:

a) decrement of the ECX/CX register;

b) comparing the ECX/CX register with zero;

c) analysis of the state of the zero flag ZF: if (ECX/CX) = 0 or ZF = 1, control is transferred to the next command after loop.

The loope/loopz and loopne/loopnz commands are reciprocal in their operation. They extend the action of the loop command by additionally parsing the zf flag, which makes it possible to organize an early exit from the loop, using this flag as an indicator.

The drawback of the loop, loope/loopz, and loopne/loopnz commands is that they only implement short jumps (-128 to +127 bytes). To work with long loops, you will need to use conditional jumps and the jmp instruction, so try to master both ways of organizing loops.

Author: Tsvetkova A.V.

We recommend interesting articles Section Lecture notes, cheat sheets:

World economy. Crib

Operational-search activity. Crib

Investments. Lecture notes

See other articles Section Lecture notes, cheat sheets.

Read and write useful comments on this article.

<< Back

Latest news of science and technology, new electronics:

Artificial leather for touch emulation 15.04.2024

In a modern technology world where distance is becoming increasingly commonplace, maintaining connection and a sense of closeness is important. Recent developments in artificial skin by German scientists from Saarland University represent a new era in virtual interactions. German researchers from Saarland University have developed ultra-thin films that can transmit the sensation of touch over a distance. This cutting-edge technology provides new opportunities for virtual communication, especially for those who find themselves far from their loved ones. The ultra-thin films developed by the researchers, just 50 micrometers thick, can be integrated into textiles and worn like a second skin. These films act as sensors that recognize tactile signals from mom or dad, and as actuators that transmit these movements to the baby. Parents' touch to the fabric activates sensors that react to pressure and deform the ultra-thin film. This ... >>

Petgugu Global cat litter 15.04.2024

Taking care of pets can often be a challenge, especially when it comes to keeping your home clean. A new interesting solution from the Petgugu Global startup has been presented, which will make life easier for cat owners and help them keep their home perfectly clean and tidy. Startup Petgugu Global has unveiled a unique cat toilet that can automatically flush feces, keeping your home clean and fresh. This innovative device is equipped with various smart sensors that monitor your pet's toilet activity and activate to automatically clean after use. The device connects to the sewer system and ensures efficient waste removal without the need for intervention from the owner. Additionally, the toilet has a large flushable storage capacity, making it ideal for multi-cat households. The Petgugu cat litter bowl is designed for use with water-soluble litters and offers a range of additional ... >>

The attractiveness of caring men 14.04.2024

The stereotype that women prefer "bad boys" has long been widespread. However, recent research conducted by British scientists from Monash University offers a new perspective on this issue. They looked at how women responded to men's emotional responsibility and willingness to help others. The study's findings could change our understanding of what makes men attractive to women. A study conducted by scientists from Monash University leads to new findings about men's attractiveness to women. In the experiment, women were shown photographs of men with brief stories about their behavior in various situations, including their reaction to an encounter with a homeless person. Some of the men ignored the homeless man, while others helped him, such as buying him food. A study found that men who showed empathy and kindness were more attractive to women compared to men who showed empathy and kindness. ... >>

Random news from the Archive

History of African Dust 31.07.2021

The research group, headed by an honorary professor of the School of Marine and Atmospheric Sciences. Rosenstiel at the University of Miami (UM) by Joseph Prospero, chronicles the transport of African dust, including three independent "first" discoveries of African dust in the Caribbean in the 1950s and 1960s.

Each year, mineral-rich dust from the Sahara desert in North Africa is lifted into the atmosphere by winds and carried on a 5000-mile journey across the North Atlantic to the Americas. African dust contains iron, phosphorus and other important nutrients that are essential for life in marine and terrestrial ecosystems, including the Amazon basin. Wind-blown mineral dust also plays an important role in climate by modulating solar radiation and cloud properties.

The researchers also discuss the discovery in the 1970s and 1980s of a link between dust transport and the African climate after increased dust transport to the Caribbean due to the onset of severe drought in the Sahel. Much of today's dust research is focused on North Africa, as it is the largest and most persistent source of dust on Earth.

Today, Prospero, nicknamed the "Father of Dust," uses a system of ground stations and satellites to study the effect of global transport from the Sahara on atmospheric composition over the Caribbean Sea.

Other interesting news:

▪ Combat laser of the third generation

▪ Stuffcool Snap Lightning Power Bank for Apple

▪ Ion Engine X-3

▪ The most dangerous day of the year

▪ Konica Minolta bizhub C3, C458 and C558 A658 color MFPs

News feed of science and technology, new electronics

 

Interesting materials of the Free Technical Library:

▪ section of the site Audio Art. Article selection

▪ article People of Goodwill. Popular expression

▪ article What is the Zodiac? Detailed answer

▪ article The air environment is the most important part of the working environment surrounding the worker

▪ article Equivalence of electric and magnetic antennas. Encyclopedia of radio electronics and electrical engineering

▪ article Living pencil. Focus Secret

Leave your comment on this article:

Name:


Email (optional):


A comment:





All languages ​​of this page

Home page | Library | Articles | Website map | Site Reviews

www.diagram.com.ua

www.diagram.com.ua
2000-2024