Files
eiffel-org/documentation/current/method/eiffel-tutorial-et/et-other-mechanisms.wiki
halw 7020d63f69 Author:halw
Date:2008-10-23T14:51:56.000000Z


git-svn-id: https://svn.eiffel.com/eiffel-org/trunk@95 abb3cda0-5349-4a8f-a601-0c33ac3a8c38
2008-10-23 14:51:56 +00:00

272 lines
23 KiB
Plaintext

[[Property:title|10 Other Mechanisms]]
[[Property:link_title|ET: Other Mechanisms]]
[[Property:weight|-4]]
[[Property:uuid|c0a01664-194c-4e84-0517-8e7c1ca61dec]]
We now examine a few important mechanisms that complement the preceding picture: shared objects; constants; instructions; and lexical conventions.
==Once routines and shared objects==
The Eiffel's method obsession with extendibility, reusability and maintainability yields, as has been seen, modular and decentralized architectures, where inter-module coupling is limited to the strictly necessary, interfaces are clearly delimited, and all the temptations to introduce obscure dependencies, in particular global variables, have been removed. There is a need, however, to let various components of a system access common objects, without requiring their routines to pass these objects around as arguments (which would only be slightly better than global variables). For example various classes may need to perform output to a common "console window", represented by a shared object.
Eiffel addresses this need through an original mechanism that also takes care of another important issue, poorly addressed by many design and programming approaches: initialization. The idea is simple: if instead of <code>do</code> the implementation of an effective routine starts with the keyword <code>once</code>, it will only be executed the first time the routine is called during a system execution (or, in a multi-threaded environment, the first time in each thread), regardless of what the caller was. Subsequent calls from the same caller or others will have no effect; if the routine is a function, it will always return the result computed by the first call -- object if an expanded type, reference otherwise.
In the case of procedures, this provides a convenient initialization mechanism. A delicate problem in the absence of a <code>once</code> mechanism is how to provide the users of a library with a set of routines which they can call in any order, but which all need, to function properly, the guarantee that some context had been properly set up. Asking the library clients to precede the first call with a call to an initialization procedure <code>setup</code> is not only user-unfriendly but silly: in a well-engineered system we will want to check proper set-up in every of the routines, and report an error if necessary; but then if we were able to detect improper set-up we might as well shut up and set up ourselves (by calling <code>setup</code>). This is not easy, however, since the object on which we call <code>setup</code> must itself be properly initialized, so we are only pushing the problem further. Making <code>setup</code> a <code>once</code> procedure solves it: we can simply include a call
<code>
setup
</code>
at the beginning of each affected routine; the first one to come in will perform the needed initializations; subsequent calls will have, as desired, no effect.
Once functions will give us shared objects. A common scheme is
<code>
console: WINDOW
-- Shared console window
once
create Result.make (...)
end
</code>
Whatever client first calls this function will create the appropriate window and return a reference to it. Subsequent calls, from anywhere in the system, will return that same reference. The simplest way to make this function available to a set of classes is to include it in a class <code>SHARED_STRUCTURES</code> which the classes needing a set of related shared objects will simply inherit.
For the classes using it, <code>console</code>, although a function, looks very much as if it were an attribute -- only one referring to a shared object.
The "Hello World" system at the beginning of this discussion (section [[4 Hello World|4]] ) used an output instruction of the form <code>io</code>. <code>put_string ( Some string )</code>. This is another example of the general scheme illustrated by <code>console</code>. Feature <code>io</code>, declared in <code>ANY</code> and hence usable by all classes, is a once function that returns an object of type <code>STANDARD_FILES</code> (another Kernel Library class) providing access to basic input and output features, one of which is procedure <code>put_string</code>. Because basic input and output must all work on the same files, <code>io</code> should clearly be a <code>once</code> function, shared by all classes that need these mechanisms.
==Constant attributes==
The attributes studied earlier were variable: each represents a field present in each instance of the class and changeable by its routines.
It is also possible to declare constant attributes, as in
<code>
Solar_system_planet_count: INTEGER = 9
</code>
These will have the same value for every instance and hence do not need to occupy any space in objects at execution time. (In other approaches similar needs would be addressed by symbolic constants, as in Pascal or Ada, or macros, as in C.)
What comes after the <code>=</code> is a manifest constant: a self-denoting value of the appropriate type. Manifest constants are available for integers, reals (also used for doubles), booleans ( <code>True</code> and <code>False</code>), characters (in single quotes, as <code>'A'</code>, with special characters expressed using a percent sign as in <code>'%N'</code> for new line, <code>'%B'</code> for backspace and <code>'%U'</code> for null).
Manifest constants are also available for strings, using double quotes as in
<code>
User_friendly_error_message: STRING = "Go get a life !"
</code>
with special characters again using the <code>%</code> codes. It is also possible to declare manifest arrays using double angle brackets:
<code>
<<1, 2, 3, 5, 7, 11, 13, 17, 19>>
</code>
which is an expression of type <code>ARRAY [INTEGER]</code>. Manifest arrays and strings are not atomic, but denote instances of the Kernel Library classes <code>STRING</code> and <code>ARRAY</code>, as can be produced by once functions.
==Instructions==
Eiffel has a remarkably small set of instructions. The basic computational instructions have been seen: creation, assignment, assignment attempt, procedure call, retry. They are complemented by control structures: conditional, multi-branch, loop, as well as debug and check.
A conditional instruction has the form
<code>
if ... then
...
elseif ... then
...
else
...
end
</code>
The <code>elseif</code> ... <code>then</code> ... part (of which there may be more than one) and the <code>else</code> ... part are optional. After <code>if</code> and <code>elseif</code> comes a boolean expression; after <code>then</code>, <code>elseif</code> and <code>else</code> come zero or more instructions.
A multi-branch instruction has the form
<code>
inspect
''exp''
when ''v1'' then
''inst1''
when ''v2'' then
''inst2''
...
else
''inst0''
end
</code>
where the <code>else ''inst0''</code> part is optional, <code>exp</code> is a character or integer expression, ''v1'', ''v1'', ... are constant values of the same type as <code>exp</code>, all different, and ''inst0'', ''inst1'', ''inst2'', ... are sequences of zero or more instructions.
The effect of such a multi-branch instruction, if the value of <code>exp</code> is one of the ''vi'', is to execute the corresponding ''insti''. If none of the ''vi'' matches, the instruction executes ''inst0'', unless there is no <code>else</code> part, in which case it triggers an exception.
{{note|Raising an exception is the proper behavior, since the absence of an <code>else</code> indicates that the author asserts that one of the values will match. If you want an instruction that does nothing in this case, rather than cause an exception, use an <code>else</code> part with an empty ''inst0''. In contrast, <code>if c then</code> ''inst'' <code>end</code> with no <code>else</code> part does nothing in the absence of an <code>else</code> part, since in this case there is no implied claim that <code>c</code> must hold. }}
The loop construct has the form
<code>
from
''initialization''
until
''exit''
invariant
''inv''
variant
''var''
loop
''body''
end
</code>
where the <code>invariant</code> ''inv'' and <code>variant</code> ''var'' parts are optional, the others required. ''initialization'' and ''body'' are sequences of zero or more instructions; ''exit'' and ''inv'' are boolean expressions (more precisely, ''inv'' is an assertion); ''var'' is an integer expression.
The effect is to execute ''initialization'', then, zero or more times until ''exit'' is satisfied, to execute ''body''. (If after ''initialization'' the value of ''exit'' is already true, ''body'' will not be executed at all.) Note that the syntax of loops always includes an initialization, as most loops require some preparation. If not, just leave ''initialization''> empty, while including the <code>from</code> since it is a required component.
The assertion ''inv'', if present, expresses a '''loop invariant''' (not to be confused with class invariants). For the loop to be correct, ''initialization'' must ensure ''inv'', and then every iteration of ''body'' executed when ''exit'' is false must preserve the invariant; so the effect of the loop is to yield a state in which both ''inv'' and ''exit'' are true. The loop must terminate after a finite number of iterations, of course; this can be guaranteed by using a '''loop variant''' ''var''. It must be an integer expression whose value is non-negative after execution of ''initialization'', and decreased by at least one, while remain non-negative, by any execution of ''body'' when ''exit'' is false; since a non-negative integer cannot be decreased forever, this ensures termination. The assertion monitoring mode, if turned on at the highest level, will check these properties of the invariant and variant after initialization and after each loop iteration, triggering an exception if the invariant does not hold or the variant is negative or does not decrease.
An occasionally useful instruction is <code>debug</code> <code>(</code>''Debug_key'', ... <code>)</code> ''instructions'' <code>end</code> where ''instructions'' is a sequence of zero or more instructions and the part in parentheses is optional, containing if present one or more strings, called debug keys. The EiffelStudio compiler lets you specify the corresponding <code>debug</code> compilation option: <code>yes</code>, <code>no</code>, or an explicit debug key. The ''instructions'' will be executed if and only if the corresponding option is on. The obvious use is for instructions that should be part of the system but executed only in some circumstances, for example to provide extra debugging information.
The final instruction is connected with Design by Contract&#153;. The instruction
<code>
check
Assertion
end
</code>, where Assertion is a sequence of zero or more assertions, will have no effect unless assertion monitoring is turned on at the <code>Check</code> level or higher. If so it will evaluate all the assertions listed, having no further effect if they are all satisfied; if any one of them does not hold, the instruction will trigger an exception.
This instruction serves to state properties that are expected to be satisfied at some stages of the computation -- other than the specific stages, such as routine entry and exit, already covered by the other assertion mechanisms such as preconditions, postconditions and invariants. A recommended use of <code>check</code> involves calling a routine with a precondition, where the call, for good reason, does not explicitly test for the precondition. Consider a routine of the form
<code>
r (ref: SOME_REFERENCE_TYPE)
require
not_void: ref /= Void
do
ref.some_feature
...
end
</code>
Because of the call to <code>some_feature</code>, the routine will only work if its precondition is satisfied on entry. To guarantee this precondition, the caller may protect it by the corresponding test, as in
<code>
if x /= Void then
a.r (x)
end
</code>
but this is not the only possible scheme; for example if an <code>create x</code> appears shortly before the call we know <code>x</code> is not void and do not need the protection. It is a good idea in such cases to use a <code>check</code> instruction to document this property, if only to make sure that a reader of the code will realize that the omission of an explicit test (justified or not) was not a mistake. This is particularly appropriate if the justification for not testing the precondition is less obvious. For example <code>x</code> could have been obtained, somewhere else in the algorithm, as <code>clone (y)</code> for some <code>y</code> that you know is not void. You should document this knowledge by writing the call as
<code>
check
x_not_void: x /= Void end
-- Because x was obtained as a clone of y,
-- and y is not void because [etc.]
end
a.r (x)
</code>
{{recommended|An extra indentation of the <code>check</code> part to separate it from the algorithm proper; and inclusion of a comment listing the rationale behind the developer's decision not to check explicitly for the precondition. }}
In production mode with assertion monitoring turned off, this instruction will have no effect. But it will be precious for a maintainer of the software who is trying to figure out what it does, and in the process to reconstruct the original developer's reasoning. (The maintainer might of course be the same person as the developer, six months later.) And if the rationale is wrong somewhere, turning assertion checking on will immediately uncover the bug.
==Obsolete features and classes==
One of the conditions for producing truly great reusable software is to recognize that although you should try to get everything right the first time around you won't always succeed. But if "good enough" may be good enough for application software, it's not good enough, in the long term, for reusable software. The aim is to get ever closer to the asymptote of perfection. If you find a better way, you must implement it. The activity of generalization, discussed as part of the lifecycle, doesn't stop at the first release of a reusable library.
This raises the issue of backward compatibility: how to move forward with a better design, without compromising existing applications that used the previous version?
The notion of obsolete class and feature helps address this issue. By declaring a feature as <code>obsolete</code>, using the syntax
<code>
enter (i: INTEGER; x: G)
obsolete
"Use ` put (x, i)' instead "
require
...
do
put (x, i)
end
</code>
you state that you are now advising against using it, and suggest a replacement through the message that follows the keyword <code>obsolete</code>, a mere string. The obsolete feature is still there, however; using it will cause no other harm than a warning message when someone compiles a system that includes a call to it. Indeed, you don't want to hold a gun to your client authors' forehead (''"Upgrade now or die !"''); but you do want to let them know that there is a new version and that they should upgrade at their leisure.
Besides routines, you may also mark classes as obsolete.
The example above is a historical one, involving an early change of interface for the EiffelBase library class <code>ARRAY</code>; the change affected both the feature's name, with a new name ensuring better consistency with other classes, and the order of arguments, again for consistency. It shows the recommended style for using <code>obsolete</code>: <br/>
* In the message following the keyword, explain the recommended replacement. This message will be part of the warning produced by the compiler for a system that includes the obsolete element.
* In the body of the routine, it is usually appropriate, as here, to replace the original implementation by a call to the new version. This may imply a small performance overhead, but simplifies maintenance and avoids errors.
It is good discipline not to let obsolete elements linger around for too long. The next major new release, after a suitable grace period, should remove them.
The design flexibility afforded by the <code>obsolete</code> keyword is critical to ensure the harmonious long-term development of ambitious reusable software.
==Creation variants==
The basic forms of creation instruction, and the one most commonly used, are the two illustrated earlier ( [[6 The Dynamic Structure: Execution Model#Creating_and_initializing_objects|"Creating and initializing objects"]] ):
<code>
create x.make (2000)
create x
</code>
the first one if the corresponding class has a <code>create</code> clause, the second one if not. In either form you may include a type name in braces, as in
<code>
create {SAVINGS_ACCOUNT} x.make (2000)
</code>
which is valid only if the type listed, here <code>SAVINGS_ACCOUNT</code>, conforms to the type of <code>x</code>, assumed here to be <code>ACCOUNT</code>. This avoids introducing a local entity, as in
<code>
local
xs: SAVINGS_ACCOUNT
do
create xs.make (2000)
x := xs
...
</code>
and has exactly the same effect. Another variant is the '''creation expression''', which always lists the type, but returns a value instead of being an instruction. It is useful in the following context:
<code>
some_routine (create {ACCOUNT}.make (2000))
</code>
which you may again view as an abbreviation for a more verbose form that would need a local entity, using a creation instruction:
<code>
local
x: ACCOUNT
do
create x.make (2000)
some_routine (x)
...
</code>
Unlike creation instructions, creation expressions must always list the type explicitly, <code>{ACCOUNT}</code> in the example. They are useful in the case shown: creating an object that only serves as an argument to be passed to a routine. If you need to retain access to the object through an entity, the instruction <code>create x</code> ... is the appropriate construct.
The creation mechanism gets an extra degree of flexibility through the notion of <code>default_create</code>. The simplest form of creation instruction, <code>create x</code> without an explicit creation procedure, is actually an abbreviation for <code>create x.default_create</code>, where <code>default_create</code> is a procedure defined in class <code>ANY</code> to do nothing. By redefining <code>default_create</code> in one of your classes, you can ensure that <code>create x</code> will take care of non-default initialization (and ensure the invariant if needed). When a class has no <code>create</code> clause, it's considered to have one that lists only <code>default_create</code>. If you want to allow <code>create x</code> as well as the use of some explicit creation procedures, simply list <code>default_create</code> along with these procedures in the <code>create</code> clause. To disallow creation altogether, include an empty <code>create</code> clause, although this technique is seldom needed since most non-creatable classes are deferred, and one can't instantiate a deferred class.
One final twist is the mechanism for creating instances of formal generic parameters. For <code>x</code> of type <code>G</code> in a class <code>C [G]</code>, it wouldn't be safe to allow <code>create x</code>, since <code>G</code> stands for many possible types, all of which may have their own creation procedures. To allow such creation instructions, we rely on constrained genericity. You may declare a class as
<code>
[G -> T create cp end]
</code>
to make <code>G</code> constrained by <code>T</code>, as we learned before, and specify that any actual generic parameter must have <code>cp</code> among its creation procedures. Then it's permitted to use <code>create x.cp</code>, with arguments if required by <code>cp</code>, since it is guaranteed to be safe. The mechanism is very general since you may use <code>ANY</code> for <code>T</code> and <code>default_create</code> for <code>cp</code>. The only requirement on <code>cp</code> is that it must be a procedure of <code>T</code>, not necessarily a creation procedure; this permits using the mechanism even if <code>T</code> is deferred, a common occurrence. It's only descendants of <code>T</code> that must make <code>cp</code> a creation procedure, by listing it in the <code>create</code> clause, if they want to serve as actual generic parameters for <code>C</code>.
==Tuple types==
The study of genericity described arrays. Another common kind of container objects bears some resemblance to arrays: sequences, or "tuples", of elements of specified types. The difference is that all elements of an array were of the same type, or a conforming one, whereas for tuples you will specify the types we want for each relevant element. A typical tuple type is of the form
<code>
TUPLE [X, Y, Z]
</code>
denoting a tuple of least three elements, such that the type of the first conforms to <code>X</code>, the second to <code>Y</code>, and the third to <code>Z</code>.
You may list any number of types in brackets, including none at all: <code>TUPLE</code>, with no types in brackets, denotes tuples of arbitrary length.
{{info|The syntax, with brackets, is intentionally reminiscent of generic classes, but <code>TUPLE</code> is a reserved word, not the name of a class; making it a class would not work since a generic class has a fixed number of generic parameters. You may indeed use <code>TUPLE</code> to obtain the effect of a generic class with a variable number of parameters. }}
To write the tuples themselves -- the sequences of elements, instances of a tuple type -- you will also use square brackets; for example
<code>
[x1, y1, z1]
</code>
with <code>x1</code> of type <code>X</code> and so on is a tuple of type <code>TUPLE [X, Y, Z]</code>.
The definition of tuple types states that <code>TUPLE [X1 ... Xn]</code> denotes sequences of at least <code>n</code> elements, of which the first <code>n</code> have types respectively conforming to <code>X1, ..., Xn</code>. Such a sequence may have more than <code>n</code> elements.
Features available on tuple types include <code>count: INTEGER</code>, yielding the number of elements in a tuple, <code>item (i: INTEGER): ANY</code> which returns the <code>i</code>-th element, and <code>put</code> which replaces an element.
Tuples are appropriate when these are the only operations you need, that is to say, you are using sequences with no further structure or properties. Tuples give you "anonymous classes" with predefined features <code>count</code>, <code>item</code> and <code>put</code>. A typical example is a general-purpose output procedure that takes an arbitrary sequence of values, of arbitrary types, and prints them. It may simply take an argument of type <code>TUPLE</code>, so that clients can call it under the form
<code>
write ([your_integer, your_real, your_account])
</code>
As soon as you need a type with more specific features, you should define a class.