OOP

Sat, 31 May 1997 22:02:45 +0200



Pierre Phaneuf wrote:

> > I agree. Also, I don't like the type-casting bit. But, AFAICS, making
> > Load a virtual method (or virtual constructor in the future?) would
> > remove all these problems. Since the VMT has been assigned at this point,
> > a virtual method can be called without any tricks.
>
> Yes, that's what I intend to do for the GPC-specific class library, but
> for the BPCOMPAT package, the Load method is a constructor, and we have to
> be compatible, so... :-(

Peter, what about making virtual constructors soon? Should be relatively
easy (I hope), just "combining" the properties of constructors and virtual
methods? ;-)

> > I know that this complicates matters, especially in the base classes, but
> > since this is mostly a one-time job, I think it's worth the effort.
>
> Actually, it simplifies things... (IMHO)

In the long run, definitely!

> > Anything more (e.g. load and store methods) will not be common to all objects
> > (e.g. a mouse object probably can't be stored in a stream), so this should be
> > introduced in another class (or interface).
>
> I thought about that, having a direct TObject descendant (TPersistent),
> but the problem lay in abstract classes that have descendants that are
> streamable and others that aren't. Like, say, TCollection, which can be a
> collection of streamable objects (where you want to implement TCollection
> itself as a streamable class) or not. Do we descend TCollection from
> TPersistent or TObject? If we descend from TPersistent, what do we do in
> case the TCollection contains non-streamable objects?

Perhaps like this:

TAbstractCollection(TObject)

TCollection(TAbstractCollection) works with TObject's

TStreamableCollection(TAbstractCollection,TStreamable) works with
  TStreamable TObject's

This assumes that TStreamable will be an interface, and that type
declarations like "TStreamable TObject", consisting of an interface and an
object type are allowed -- this should not be much more difficult to
implement than interface type variables, see below.

TAbstractCollection is an abstract class that implements all the
functionality (working on TObjects), but only as protected methods.

Most methods of TCollection and TStreamableCollection just call the methods
of TAbstractCollection.

The introduction of TAbstractCollection seems necessary in order to prevent
a TStreamableCollection from being cast into a TCollection -- which would
allow inserting non-streamable objects into a TStreamableCollection.

I think this is a rather difficult example, but I think with a mechanism
like what I described, it can be solved (perhaps there are a few problems
left, but they can be solved, too, hopefully). The benefit is that all the
checks (if a collection is streamable) can be made at compile time. If
collections are implemented "more simply", this check can only be made at
runtime, AFAICS, by checking if all members of a collection are streamable
(or to allow only collections of streamable objects at all) -- I suppose
Borland does it one of these ways!?
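
A rough sketch of this layering, to make the compile-time check concrete.
It assumes a compiler with Delphi-style classes (here Free Pascal in objfpc
mode), and it models TStreamable as an abstract class, because the combined
type "TStreamable TObject" proposed above does not exist anywhere yet.
TStreamablePoint, StreamName, DoInsert, DoAt and Demo are names made up for
this illustration only.

```pascal
program CollectionSketch;
{$mode objfpc}

type
  { Stand-in for the proposed TStreamable interface (an assumption of this
    sketch): an abstract class, since "TStreamable TObject" is not a type. }
  TStreamable = class
    function StreamName: string; virtual; abstract;
  end;

  TStreamablePoint = class(TStreamable)
    function StreamName: string; override;
  end;

  { Implements all functionality on TObjects, but only as protected methods. }
  TAbstractCollection = class
  strict private
    FItems: array of TObject;
  strict protected
    procedure DoInsert(Item: TObject);
    function DoAt(Idx: Integer): TObject;
  public
    function Count: Integer;
  end;

  { The public methods of both descendants just forward to the protected ones. }
  TCollection = class(TAbstractCollection)
  public
    procedure Insert(Item: TObject);
    function At(Idx: Integer): TObject;
  end;

  TStreamableCollection = class(TAbstractCollection)
  public
    procedure Insert(Item: TStreamable);
    function At(Idx: Integer): TStreamable;
  end;

function TStreamablePoint.StreamName: string;
begin
  Result := 'TStreamablePoint';
end;

procedure TAbstractCollection.DoInsert(Item: TObject);
begin
  SetLength(FItems, Length(FItems) + 1);
  FItems[High(FItems)] := Item;
end;

function TAbstractCollection.DoAt(Idx: Integer): TObject;
begin
  Result := FItems[Idx];
end;

function TAbstractCollection.Count: Integer;
begin
  Result := Length(FItems);
end;

procedure TCollection.Insert(Item: TObject);
begin
  DoInsert(Item);
end;

function TCollection.At(Idx: Integer): TObject;
begin
  Result := DoAt(Idx);
end;

procedure TStreamableCollection.Insert(Item: TStreamable);
begin
  DoInsert(Item);
end;

function TStreamableCollection.At(Idx: Integer): TStreamable;
begin
  { Safe: Insert is the only way in, and it accepts TStreamable only. }
  Result := TStreamable(DoAt(Idx));
end;

function Demo: string;
var
  Plain: TCollection;
  Streamables: TStreamableCollection;
  Pt: TStreamablePoint;
  Other: TObject;
  A, B: string;
begin
  Plain := TCollection.Create;
  Streamables := TStreamableCollection.Create;
  Pt := TStreamablePoint.Create;
  Other := TObject.Create;
  Plain.Insert(Other);
  Plain.Insert(Pt);
  Streamables.Insert(Pt);
  { "Streamables.Insert(Other)" and "Plain := Streamables" are both
    rejected by the compiler as type mismatches. }
  Str(Plain.Count, A);
  Str(Streamables.Count, B);
  Result := A + ' ' + B + ' ' + Streamables.At(0).StreamName;
  Other.Free; Pt.Free; Streamables.Free; Plain.Free;
end;

begin
  WriteLn(Demo);
end.
```

Since the two collection types are siblings rather than parent and child,
neither can be assigned to a variable of the other type, which is the whole
point of the extra abstract layer.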

> I like having my objects "cleaned out", but this is really personal... :-)
> I guess no constructor is ok if we don't do anything, but I'd like to keep
> this... It's easy to bypass if you don't want to use this...

No objections from me, just a question:

What can be assumed about data fields after FillChar'ing them with 0?
I suppose, for all ordinal types, one can assume the value with Ord = 0,
right? For real numbers, probably nothing can be assumed!? What about sets,
can [] be assumed? Pointers=NIL? ...

With BP and DO$, it's simple. The question is how much sense such a
FillChar makes on any platform.
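
A small test program can at least answer the question for one given
compiler and platform. This is only a sketch, assuming Free Pascal in objfpc
mode; TSample, TColor and ZeroFillGivesDefaults are made-up names. It proves
nothing about other platforms: on an IEEE 754 machine all-zero bits happen
to be +0.0, but no Pascal standard promises that, and an ordinal subrange
such as 1..10 would hold an out-of-range value after the fill.

```pascal
program ZeroFillCheck;
{$mode objfpc}

type
  TColor = (Red, Green, Blue);
  TSample = record
    I: LongInt;
    B: Boolean;
    E: TColor;
    P: Pointer;
    S: set of Byte;
    R: Double;
  end;

{ Fills a record with zero bytes and reports what each field looks like. }
function ZeroFillGivesDefaults: Boolean;
var
  X: TSample;
begin
  FillChar(X, SizeOf(X), 0);
  WriteLn('integer = 0     : ', X.I = 0);
  WriteLn('boolean = False : ', not X.B);
  WriteLn('Ord(enum) = 0   : ', Ord(X.E) = 0);
  WriteLn('pointer = nil   : ', X.P = nil);
  WriteLn('set = []        : ', X.S = []);
  WriteLn('real = 0.0      : ', X.R = 0.0);
  Result := (X.I = 0) and not X.B and (Ord(X.E) = 0)
        and (X.P = nil) and (X.S = []) and (X.R = 0.0);
end;

begin
  if ZeroFillGivesDefaults then
    WriteLn('All assumptions hold on this platform.')
  else
    WriteLn('At least one assumption fails on this platform.');
end.
```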

David Fiddes wrote:

> Here's a simple example of how Delphi 3.0 implements interfaces. The main
> difference between this and what Frank was suggesting is that Delphi 3.0
> doesn't allow any fields in an interface.

Perhaps you misunderstood me. The only "fields" allowed in a Java interface
are plain constants, which, technically, are no fields, i.e. don't appear
either in the object's data or in the VMT. They're just constants that could
be declared outside of the interface declaration as well - except for scoping
differences (or, more precisely, the need for a qualified identifier to
access them outside of the object type's methods).

> This is because COM is designed
> to work across process and machine boundaries using remote procedure
> calls (RPC) but doesn't have support for remote variable calls (if there is
> such a thing!) and this is the main reason for using interfaces...

Actually, Java is a threaded language, too (and though I think it was not
designed for distributed programs, this should be easily possible in Java),
so this might have been a consideration there, too. But I think the main
goal in Java was to avoid the problems C++ got from MI.

> this simple example of a Delphi interface is from a copy of a unit I'm
> working on to use MS ActiveScripting.
>
> the interface bit:
> const
>     SIID_IActiveScriptSiteWindow = '{D10F6761-83E9-11cf-8F20-00805F2CD064}';

I don't understand what purpose this serves. Perhaps Delphi's interfaces are
more than what we're discussing here, i.e. an extension to the object model.
It seems to me they're somehow tied to some Windoze features...
I would object to that, since language features and OS things should be
separated, IMHO.

OTOH, this might also be the "ObjID" of the interface, used to locate the
interface in the VMT, see below.

> type
>
>   IActiveScriptSiteWindow = interface(IUnknown)
>     [SIID_IActiveScriptSiteWindow]

What's IUnknown for? I'd prefer simply "interface" for a "top-level"
interface, just like "object" for a top-level object.
And, in contrast to object types, there is really no need for a "mother of
all interfaces", since there's *nothing* (no method, no constant, and of
course not the ID) that all interfaces have in common.

> You access an interface using the QueryInterface function in the IUnknown
> interface from which all interfaces are derived by passing it the
> interfaces IID.

This partly explains what IUnknown is for, and it reveals something about
how Delphi implements all of this. But, AFAICS now, I don't like this, if
the programmer has to do extra work to access the interfaces. I'd like to
have it transparently, just like in my description of Java, i.e. you can
pass (a pointer P to) any object that implements IWhatever to a procedure,
and this procedure can call P^.SomeMethod if SomeMethod is a method declared
in IWhatever. Of course, this involves some work for the compiler (discussed
below), but I still think it's possible to do, and then it's more
comfortable to use these interfaces.

Peter Gerwinski wrote:

> Compatibility to other widely-used compilers and/or well-defined standards
> is always desirable as long as it doesn't take away too much time from
> more important tasks.

... and if it doesn't add misfeatures to other people, or the misfeatures
can be turned off completely, as with:

> `--store-object-names', and switched ON in Delphi compatibility mode.

... and OFF otherwise!

> As I understand this, the Interface mechanism is MI with the following
> requirements:
> 
>   * Each object type has one "primary ancestor"; possible other ancestors
>     are "secondary".

Or no "primary ancestor" at all. It can implement some interfaces, but not
inherit from some other object type.

>   * Definition:  An object type which does not have data fields, but only
>     abstract (virtual) methods qualifies as an "interface".
> 
>   * The primary ancestor of an object may be any object; secondary
>     ancestors must be interfaces.

I'm not sure if this notion really helps. Doing it your way, the compiler
has to distinguish between regular object types and interfaces. And I think
the code gets clearer if interfaces are clearly recognizable as such.
I don't see an advantage of your way: interfaces can't be instantiated,
anyway (since they're abstract), and a (regular) object type that should be
"inherited" from an interface can simply implement this interface and
inherit from no type.

> This implies:
> 
>   * MI among interfaces works without restrictions.

Right.

>   * Instances of interfaces are useless.

Not even allowed. (Cf. below about abstract object types.)

>   * You cannot inherit data fields and function bodies via MI.

No, you can't (only from the "primary ancestor").

>   * MI no longer causes all those catastrophes, as would be the case
>     with unrestricted MI among "real" objects.

Correct. :-)

(Of course, there can be conflicting method identifiers with interfaces that
are implemented, but these should simply generate "duplicate identifier"
errors.)

> Making these rules obligatory, it would be safe to introduce MI into GPC,
> thus introducing interfaces without needing another keyword.  (Just an
> idea.)  However since Delphi seems to have interfaces, we should use their
> syntax rather than inventing another one.  (Just another idea.)

As I said above, I'd vote for the second idea ("interface" is a keyword
anyway). AFAICS, the Delphi syntax as shown in the example by David looks
like we could adopt most of it -- as I said, I'm not sure about the IUnknown
bit, but I think it can be optional (just declared as an empty interface for
compatibility reasons).

Also, I think Delphi's syntax

   IWhatever = interface(...) [ID];

looks like a convenient way to declare the ObjID, also for regular object
types.

To sum up (again) what I now think about ObjIDs:

- ObjID (or whatever it will be called) is an object constant of every
  object type and interface type.

- Its type is a 64/128 bit integer.

- It's not inherited. If not explicitly declared in a new type, its value
  will be 0. (Perhaps with a warning if the parent type has an ObjID<>0.)

- The compiler checks that no two types or interfaces in a program have the
  same ID (except 0, of course).

- For convenience, it can be declared in "[]" as above.

This removes all needs for class registration, and perhaps solves some
problems with interfaces. I think I like that!

> (* Hmm ... the above rules for "careful use of MI" could be useful for *)
> (* C++ programmers ... perhaps we should tell them?                   :*)

Hmm ... I think I know some more rules for "careful use of C[++]". Should
we tell them? Would they listen to us? Would they laugh at us? ...

> > AFAICS, the only thing that really makes problems are variables (or
> > parameters) of interface types.
>
> What's the problem?  An instance of an interface would be an empty object,
> containing nothing besides the VMT pointer.

No! There aren't any instances of interfaces!

A variable of a pointer-to-an-interface type must point to the actual object
(which can be of any type that implements that interface), and (somehow) give
the information where in this type's VMT the methods of that interface are
located.

> > First of all, since interfaces can't be instantiated, such variables or
> > parameters must be pointers (or var parameters). In Java, this is implied,
> > since *all* object variables are pointers.
>
> Do you mean:  If an object implements an interface (in Java sense) it must
> always be accessed through a pointer?  If so, why?

No, if type T implements interface I, there can be a variable V of type T,
no problem. But V can't be of type I (since interfaces can't be
instantiated). You can, however, declare a variable P of type ^I and assign
@V to P (since V has all the properties that I demands).

This is no special rule, it follows from the fact that interfaces can't be
instantiated. The same holds for abstract object types. Assuming TObject is
abstract, there can't be a variable of type TObject, but there can be
variables of type PObject, and there can be VAR parameters of type TObject.

> > - A "pointer" to an interface variable consists of two parts: the actual
> >   pointer to the variable, and the VMT offset of the first method (or,
> >   alternatively, directly the address of the first method in the VMT).
> >
> >   Disadvantage: The pointer gets twice as big. The difference must be
> >   considered when assigning it to another pointer (this could be an untyped
> >   pointer or a pointer of one of the "parent" interfaces - in the latter
> >   case the VMT offset has to be adjusted).
>
> I'm afraid we can forget about this for that reason.

Why? Is there a rule carved in stone that a pointer must consist only of a
memory address?

Actually, I'm going to take this a bit further (I don't think Java has this,
but why not):

Let T be any object type and In be some interfaces.

The following variable declarations could all be legal:

T
^T
^I1
^I1 T
^I1 I2
^I1 I2 T
...

In general: P can be a variable of type pointer to (n interfaces I1 .. In and
optionally one object type T).

Legal assignments to P are objects of any type that implements all I1 .. In
(and is T or a descendant of T, if T is given).

The internal representation of P consists of the actual address of the object
and n addresses that point to the first method of each of the n interfaces
inside the VMT of the actual type of the object.
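
To illustrate this representation for the simplest case (one interface, no
object type), here is a sketch that emulates by hand what the compiler would
generate. It assumes Free Pascal in objfpc mode. The interface "IShape", its
method table TShapeMethods, the two-word pointer TShapeRef and the types
TSquare and TRectangle are all invented for the example; a real
implementation would take the method addresses from the VMT instead of from
separate tables.

```pascal
program FatPointerSketch;
{$mode objfpc}

type
  { One slot per method of the hypothetical interface IShape. }
  TAreaFunc = function(Inst: Pointer): Double;
  TNameFunc = function(Inst: Pointer): string;

  PShapeMethods = ^TShapeMethods;
  TShapeMethods = record
    Area: TAreaFunc;
    Name: TNameFunc;
  end;

  { "^IShape": the object's address plus the address of IShape's methods. }
  TShapeRef = record
    Obj: Pointer;
    Methods: PShapeMethods;
  end;

  PSquare = ^TSquare;
  TSquare = record
    Side: Double;
  end;

  PRectangle = ^TRectangle;
  TRectangle = record
    Width, Height: Double;
  end;

function SquareArea(Inst: Pointer): Double;
begin
  Result := Sqr(PSquare(Inst)^.Side);
end;

function SquareName(Inst: Pointer): string;
begin
  Result := 'square';
end;

function RectangleArea(Inst: Pointer): Double;
begin
  Result := PRectangle(Inst)^.Width * PRectangle(Inst)^.Height;
end;

function RectangleName(Inst: Pointer): string;
begin
  Result := 'rectangle';
end;

const
  { How each type implements IShape; fixed at compile time. }
  SquareAsShape: TShapeMethods =
    (Area: @SquareArea; Name: @SquareName);
  RectangleAsShape: TShapeMethods =
    (Area: @RectangleArea; Name: @RectangleName);

{ What an assignment "P := @V" to a ^IShape variable would do. }
function MakeRef(Obj: Pointer; Methods: PShapeMethods): TShapeRef;
begin
  Result.Obj := Obj;
  Result.Methods := Methods;
end;

{ A call through the interface: one indirection, no searching. }
function AreaOf(const S: TShapeRef): Double;
begin
  Result := S.Methods^.Area(S.Obj);
end;

function Demo: Double;
var
  Sq: TSquare;
  Re: TRectangle;
  Refs: array[0..1] of TShapeRef;
  I: Integer;
begin
  Sq.Side := 3;
  Re.Width := 2;
  Re.Height := 5;
  Refs[0] := MakeRef(@Sq, @SquareAsShape);
  Refs[1] := MakeRef(@Re, @RectangleAsShape);
  Result := 0;
  for I := 0 to 1 do
  begin
    WriteLn(Refs[I].Methods^.Name(Refs[I].Obj), ': ', AreaOf(Refs[I]):0:1);
    Result := Result + AreaOf(Refs[I]);
  end;
end;

var
  Total: Double;
begin
  Total := Demo;
  WriteLn('total: ', Total:0:1);
end.
```

The cost is visible in TShapeRef: two machine words instead of one. With n
interfaces the record would carry n method addresses next to Obj.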

I hope this was understandable so far -- if not, I can try to explain again.

> > - The VMT must contain information about all interfaces that are implemented
> >   together with the addresses of their methods. However, since different
> >   object types can implement different interfaces, and one type can
> >   implement more than one interface, I can't think of a method that doesn't
> >   involve some kind of searching (searching the wanted interface out of
> >   possibly many interfaces).
>
> That's the problem why I initially asked about MI.

The problem would not be any easier with MI.
[Proof: You showed above that interfaces are just a special case of MI. ;-]

> Does anybody know how C++ solves that problem?  Or Java?  Or Delphi?

No, but from Delphi's interface IDs I gather it uses something like the
second way. (And since it runs under Windoze, efficiency doesn't matter,
anyway... ;-)

> I agree.  Calling virtual methods works quite fast as it is implemented
> now.  If having interfaces (or MI in general) would slow this down, I would
> vote against it.

Me too.

> But this is quite an interesting problem - not a technical one, how to
> implement this-or-that without interfering with that-or-this syntax from
> another dialect.  Here we have a problem where it is not even clear that
> a fast (i.e. O(1)) solution exists.  :-(-:

There is a O(1) solution -- the first one!

It increases the memory needed for (pointer-to-)interface type variables/
parameters, but this might just be the price we have to pay. It's O(1) in
size and speed, and it takes more space than now only when one actually uses
interfaces. Doesn't seem too bad to me!

So with the first solution, AFAICS, the ObjIDs for interfaces are not needed
(contrary to what I said above) -- they can be accepted in Delphi
compatibility mode, but I see no need for them...

> > If you used the address everywhere you use the ID now, you would know,
> > wouldn't you?
> >
> > [...]
> >
> > Yes, but what do you need to do with IDs?
>
> The unique ID can be stored in a stream; the address cannot.

A valid point! But I think the IDs should be generated within the storing
routines, and resolved (to pointers) within the loading routines. This can
be done quite efficiently, O(n log n), perhaps O(n^2) worst case. No need
to keep the IDs during the (regular) operation, wasting memory and time.

There may be some types that need a persistent ID (i.e. one that cannot
simply be regenerated with each storing, for whatever reason), but then
again, ID should be a field of these special types only.

> Use:  Think of a tree of objects holding numerical data.  A method of
> an object somewhere in that tree wants to calculate something.  For this
> purpose it needs some data stored elsewhere in the tree.  Then the
> unique ID can be used to locate that other data object.

For this purpose, I'd use a pointer to that other object.
When storing the whole thing to a stream, the pointers can be converted to
(numerical) IDs that are unique to this data structure in this stream at this
time. While loading the stream, the IDs can be converted back to pointers.
(This takes some programming effort, but it's a one-time job! I think I could
program these conversions if necessary.)
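
A minimal sketch of such a conversion, assuming Free Pascal in objfpc mode.
The "stream" is just an array of records here, and TNode, TStoredNode, IdOf,
Store, Load and RoundTripWorks are made-up names. The ID of an object is
simply its position in the list of objects being stored. The lookup in IdOf
is a plain linear search, so storing is O(n^2) in this form; a sorted table
of addresses would be needed for O(n log n).

```pascal
program PointerIdSketch;
{$mode objfpc}

const
  NoID = -1;  { stands for a nil reference in the stream }

type
  PNode = ^TNode;
  TNode = record
    Value: Integer;
    Other: PNode;      { reference to another node, may be nil }
  end;

  { What would be written to the stream for one node. }
  TStoredNode = record
    Value: Integer;
    OtherID: Integer;  { position of the referenced node, or NoID }
  end;

{ Storing side: pointer -> ID, valid for this data structure only. }
function IdOf(const Nodes: array of PNode; P: PNode): Integer;
var
  I: Integer;
begin
  Result := NoID;
  for I := 0 to High(Nodes) do
    if Nodes[I] = P then
    begin
      Result := I;
      Break;
    end;
end;

procedure Store(const Nodes: array of PNode;
                var Stored: array of TStoredNode);
var
  I: Integer;
begin
  for I := 0 to High(Nodes) do
  begin
    Stored[I].Value := Nodes[I]^.Value;
    Stored[I].OtherID := IdOf(Nodes, Nodes[I]^.Other);
  end;
end;

{ Loading side: first create all nodes, then turn the IDs back into pointers. }
procedure Load(const Stored: array of TStoredNode;
               var Nodes: array of PNode);
var
  I: Integer;
begin
  for I := 0 to High(Stored) do
  begin
    New(Nodes[I]);
    Nodes[I]^.Value := Stored[I].Value;
  end;
  for I := 0 to High(Stored) do
    if Stored[I].OtherID = NoID then
      Nodes[I]^.Other := nil
    else
      Nodes[I]^.Other := Nodes[Stored[I].OtherID];
end;

function RoundTripWorks: Boolean;
var
  Orig, Loaded: array[0..2] of PNode;
  Stream: array[0..2] of TStoredNode;
  I: Integer;
begin
  for I := 0 to 2 do
  begin
    New(Orig[I]);
    Orig[I]^.Value := 10 * (I + 1);
    Orig[I]^.Other := nil;
  end;
  Orig[0]^.Other := Orig[2];
  Orig[2]^.Other := Orig[1];

  Store(Orig, Stream);
  Load(Stream, Loaded);

  { The loaded nodes must refer to each other, not to the originals. }
  Result := (Loaded[0]^.Other = Loaded[2])
        and (Loaded[2]^.Other = Loaded[1])
        and (Loaded[1]^.Other = nil)
        and (Loaded[0]^.Other^.Value = 30);

  for I := 0 to 2 do
  begin
    Dispose(Orig[I]);
    Dispose(Loaded[I]);
  end;
end;

begin
  WriteLn('round trip preserved the references: ', RoundTripWorks);
end.
```

During normal operation no node carries an ID field; the IDs exist only
inside Store and Load.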

> > Where do you get the SelfIDs from? Perhaps a list of IDs stored in a parent
> > object? You could put the addresses there instead, couldn't you?
>
> The IDs must be arranged in a way that you can read off them what kind of
> object we have.

??? Now I think you lost me!

I assume, by "kind of object" you mean its type, right?

But the type information is already there (through the VMT link), isn't it?
Any procedure can check the VMT link (together with the "IS" operator) to
examine the type of any object it has a pointer to -- and usually, type
distinctions should not be made by the caller at all, but by the called
object (through virtual methods).

If you mean in a stream: the type information will be stored by the ObjID,
which will be converted to a VMT link by the stream loader, so there's also
no need for the object (instance) ID to contain any type information.

Am I missing something?

> > I assume "pure virtual" is the same as "abstract" methods!?
> > I.e., any class that has at least one abstract method can't be instantiated.
>
> Is it correct like that?
>
>
>     Type
>       MyObj = object
>         Procedure Foo; abstract;
>       end (* MyObj *);
>
>       MySecondObj = object ( MyObj )
>         Procedure Foo; virtual;
>       end (* MySecondObj *);

Of course, MyObj could also declare some non-abstract methods and/or fields
(otherwise, it could also be an interface).

> `MySecondObj.Foo' must be implemented, and `MyObj.Foo' mustn't?

Right.

> And calls to `Foo' in instances of `MyObj' would yield a run-time error?

No -- there can't be any instances of MyObj, the compiler should check this.
That's the main goal of the whole thing: to make these checks at
compile-time, not at run-time.

The African Chief wrote:

> Nothing stopping you from having a different name for your handle. Also,
> that is one reason for having the source code. You can change anything
> that doesn't suit you.

No, no, no! (That is, you can, but you really shouldn't!)

We're talking about a class library for widespread use, aren't we?
Imagine if everybody changed the base objects: units from different sources
would no longer fit together!

The OOP way to do this is: if something doesn't suit you, derive a new class,
and apply all modifications you want to the new class.

> Apart from that, in other places, I would declare the
> handle as a THandle or HWnd - the meaning of which varies depending
> on the platform. In Win16 it is a Word. In Win32 it is a long integer (which
> is the same as an integer in GPC).

So don't declare it at all in TObject, but declare it where necessary with
the appropriate type.

> You wouldn't. And that is the crux of the matter. In CHIEFAPP define a
> GetObjectCount function which returns the number of currently active
> objects. I can loop through these and do any number of things with
> the information - here is a trivial example;
>
> for i := 1 to GetObjectCount do begin
>     p := InstanceFromSelfID ( i );

So the ID can be a pointer! Then InstanceFromSelfID would be a simple
lookup in an array (or whatever structure is used to hold the IDs).

>     If Assigned ( p ) then begin
>        If p^.Name = 'CHIEFDIALOG' then { blah blah }

better: If Typeof(p^) = Typeof(TChiefDialog) then { blah blah }

Removes the need for Name field, also pointer comparisons are (usually)
faster than string comparisons.

>        else If p^.Name = 'CHIEFCONTROL' then {blah blah}

Or better yet (depending on the situation): declare a virtual method in a
common ancestor of ChiefDialog and ChiefControl (or an interface implemented
by both of them), and just call this method without any if...then's.
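
A sketch of that last variant with BP-style objects, assuming Free Pascal in
objfpc mode. TChiefObject as the common ancestor, the Describe method and
the DescribeAll helper are invented for the example; they are not taken from
CHIEFAPP. The loop runs over an array of pointers and needs neither a Name
field nor any if...then's.

```pascal
program DispatchSketch;
{$mode objfpc}

type
  PChiefObject = ^TChiefObject;
  TChiefObject = object
    constructor Init;
    destructor Done; virtual;
    function Describe: string; virtual;
  end;

  PChiefDialog = ^TChiefDialog;
  TChiefDialog = object(TChiefObject)
    function Describe: string; virtual;
  end;

  PChiefControl = ^TChiefControl;
  TChiefControl = object(TChiefObject)
    function Describe: string; virtual;
  end;

constructor TChiefObject.Init;
begin
end;

destructor TChiefObject.Done;
begin
end;

function TChiefObject.Describe: string;
begin
  Result := 'object';
end;

function TChiefDialog.Describe: string;
begin
  Result := 'dialog';
end;

function TChiefControl.Describe: string;
begin
  Result := 'control';
end;

{ Loops over pointers; each object decides for itself what to do. }
function DescribeAll: string;
var
  Items: array[0..1] of PChiefObject;
  D: PChiefDialog;
  C: PChiefControl;
  I: Integer;
begin
  New(D, Init);
  New(C, Init);
  Items[0] := D;
  Items[1] := C;
  Result := '';
  for I := 0 to 1 do
  begin
    if I > 0 then
      Result := Result + ' ';
    Result := Result + Items[I]^.Describe;
  end;
  for I := 0 to 1 do
    Dispose(Items[I], Done);
end;

begin
  WriteLn(DescribeAll);
end.
```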

> > And what kind of things would you do with an unique integer ID?
>
> Inter-process communication and message passing for one.

No problem with pointers! (Assuming the objects reside in some kind of shared
memory, but otherwise an integer ID would be quite useless as well.)

> Saving
> and storing things for another - the objects themselves, the state of
> the desktop, the whole environment in which the program is running,
> etc.

See above (my reply to Peter).

> >> >BTW: Would it be possible at all? Doesn't TObject refer to TStream?
> >> No, it doesn't. In bpcompat, TObject has a "Load" constructor that takes
> >> a TObject parameter.
> >
> >But doesn't it have to refer to the stream somehow? Do you type-cast the
> >parameter into a stream
>
> Actually, the Load constructor in TObject does nothing at all.

Yes, you're right. With Load being a constructor, and explicitly declaring
the StreamRecs, this works. BTW: Does TObject need a Load constructor at all?

(If Load is going to be a virtual method or virtual constructor later, it has
to be declared at a base type (or interface), but not in TObject.)

> >> That assumes that you already know the object and its address.
> >
> >If you used the address everywhere you use the ID now, you would know,
> >wouldn't you?
>
> SelfID can be accessed at random - e.g., through a loop. You can't
> do that with addresses.

Why not?

I suppose you're looping through an array of IDs, right? You can loop
through an array of pointers as well.

> >Where do you get the SelfIDs from? Perhaps a list of IDs stored in a parent
> >object? You could put the addresses there instead, couldn't you?
>
> There is a local instance in the OBJECTS unit which allocates an ID
> each time INIT or CREATE  is called. There is a function : "InstanceFromSelfID"
> which returns a pObject for any given SelfID or NIL if there isn't any active
> object with that SelfID.

What do you mean by "allocate"? Do you have a global "collection" of all
"active" (=existing?) objects, with their addresses and IDs?
(Side note: I don't think that's a good idea. In OOP, there's hardly a need
for global data. If this collection is needed at all, it might be better put
into a "parent object".)

I still don't see any problem. Can't you just remove any "ID" fields (that
contain the ID of the object itself), and change any references to other
objects' IDs into pointers to them? The "collection" above would then contain
all the addresses of active objects, and you could easily loop through them.

> So, I could really just do something randomly like this;
>
>    p := InstanceFromSelfID ( 5 );

This means "give me a pointer to the object with ID 5", or what?
With the changes above, you would already have the pointer instead of the
ID 5, so this function call would even become superfluous.

If you mean "give me the pointer to the fifth object [of whatever]", the
function call would stay the same, of course.

> Like I said, OBJECTS.PAS is a cannibalised form of the main objects
> unit in another project - a portable class library ( BP 7.x, Delphi 1 and
> Delphi 2) - viz - using only pure Pascal and the Windows API - no OWL,
> and no VCL.

Well, BPs objects are not very much standard or "pure Pascal", BTW!
But you could do what I just said with BP as well.

> A lot of  that one was implemented from the perspective of
> the Windows API  and communications between all sorts of things. It
> will take too much time to explain it here.

I think Windoze is a bad example! It does pretty much everything with
numerical handles, but this is not how it should be (IMHO) in a typed
language like Pascal!

Well, interfacing the model that we'll develop to Windoze might be some
additional work then, but this shouldn't stop us from designing gpc well.

> >Right, everything that *every* object should have, not everything that *some*
> >objects should have. E.g. handles: not every object has a handle, so it should
> >not be in the base class. If in a windowing system, every instance of TWindow
> >and its childs will have a handle, then Handle should be a field of TWindow.
>
> Let's put it this way. If I declare; "Var p : pObject", I would like to be well assured
> that p^ will always have a Handle, a Name, and a SelfID. I can then use p in any
> number of situations, without having to declare another pWindow just to get
> something that has a Handle. I can find out the TYPE of p by its Name.

No, you can find out the type of p by checking Typeof(p^).
After all, the basic operations of microprocessors work on numbers, not on
strings, and we should try to stick to that. In other words: strings (like
names) are there for interfacing the computer to humans, but (usually) not
for internal purposes of the computer, IMHO. (If the others decide otherwise,
this will mean that ObjID becomes a string instead of a number, well ok,
then this would be in TObject.)

SelfID: see above -- the address suffices.

Handle: No! Not every TObject has a handle. Actually, I think, in a UI
that's served within the program itself (like TV - in contrast to windowing
systems), you don't need any handles at all. The SelfID, and therefore the
address, does all you need, doesn't it? AFAICS, Borland's TV doesn't use
handles.

So again, if you are in an environment (like Windoze) where some objects have
handles, and you have a situation where you want to do something with the
handle of an object, you declare the variable (or parameter) of an
appropriate type that has a handle, but not the base class (TObject), and
certainly not a class that will be used in other environments, too.
(Otherwise, soon TObject would probably include things like a file name, an
URL, a creation/modification/access date/time, perhaps access rights, the
address of an X server, terminal settings and whatever more...)

> >> They could all have a Name, a SelfID, a Parent, and a Child.
> >
> >Do you mean a name/ID/parent/child of the *class*, or of the *instance*?
>
> Of the class.
>
> >In the former case, I agree (except for the name, which IMHO usually isn't
> >needed at runtime, see above),
>
> I have read all the discussion about VMTs. I still prefer this;
> 	If p^.Name = 'TCHIEF' then ....

The object constants (stored in the VMT) should be accessible like this.
(At least that's what I have in mind.)

Except that Name will not be automatically defined (only in Delphi-mode or
if explicitly switched on).

If things are done as I suggested, you could do things like:

type t=object
         const c:integer=2; {stored in VMT of t}
         var cv:integer=3;  {class variable; stored in VMT of t; syntax???}
         v:integer;         {stored in data area of o}
       end;

var o:t;
...
o.v:=o.cv+o.v;
...

> >Why not simply derive one (or several) classes from it with whatever you
> >want? That would be in the spirit of OOP (not to change existing classes,
> >but just derive new ones from them).
>
> One of the things I hate about some frameworks that will remain
> nameless is the multitude of ancestors. It makes it really tedious
> when you are trying to find out what is really going on in an object
> (sometimes having to plough through myriads of objects in myriads
> of units). Sometimes inheritance can be taken too far.

You seem to be doing the other extreme. The former might be tedious and
confusing to use, especially at first sight, and it requires good
documentation, but the latter can lead to real problems if code from
different sources doesn't fit together.

Pierre Phaneuf wrote:

> Yes? Simply use the address instead of the ID, will do exactly the same
> thing.
>
> [...]
>
> Hmm... Could you give me an example of an integer (or better yet, the
> instance pointer) not being flexible enough? And how do you plan to do your
> string system??
> 
> [...]
>
> A wide need for this information??? Just give me a single example where
> knowing the class name at runtime would be useful, not taking into account
> debugging purposes, where you could/should use a debugger or normal
> constants.
>
> [...]
>
> MyObj.Foo shouldn't be implemented, exactly. But having a call to
> MyObj.Foo generate a run-time error would be exactly like how it is done
> now in BP. It is simple: a class containing one or more "abstract" methods
> shouldn't be allowed to be instantiated. Presently you *can* instantiate a
> TStream object, but this will yield a run-time error as soon as one of the
> abstract methods is called (because their implementation consists of a call
> to Abstract, which has a call RunError()). If TStream would use an
> "abstract" keyword to indicate its abstract methods, it wouldn't even be
> allowed to be instantiated. The compiler would yield a compile-time error
> where possible (so that code instantiating TStream wouldn't even compile)
> and generate a run-time error where the illegal instantiation couldn't be
> detected at compile-time. (with our AssignType() function for example)

Pierre, I guess people are starting to think we're the same person writing
from two different accounts... ;-)
-- 
Frank Heckenbach, Erlangen, Germany
heckenb@mi.uni-erlangen.de
Turbo Pascal:   http://www.mi.uni-erlangen.de/~heckenb/programs.htm
Internet links: http://www.mi.uni-erlangen.de/~heckenb/links.htm

