Re: String types in future GPC versions

Tue, 28 Oct 1997 20:15:26 GMT



Jesper Lund wrote:

> >According to Extended Pascal, strings are written with parentheses
> >("Str255 = String ( 255 )"), but GPC also accepts UCSD/Borland syntax
> >("Str255 = String [ 255 ]").  In the future we *might* change this such
> >that GPC produces an Extended Pascal string (unlimited length, 8 bytes
> >overhead) with "(...)" but a Borland Pascal string (limited to 255 chars,
> >1 byte overhead) with "[...]".
> >
> 
> Wouldn't that feature (choosing between two memory allocation/layout schemes
> for strings) create a lot of problems

Yes.

> for relatively little benefit?

I don't think so.

> First, if I have a procedure with a var string type,
> 
>   procedure foo (var s : string)
> 
> can I pass variables of either string type (Borland or Extended)?  

No. But we're discussing something like "AnyString" parameters for which you
will be able to pass any string type.

First of all, there are already three different string types in GPC,
namely the (EP) string schema, the (SP/BP) arrays of char and the "CStrings"
(necessary for interfacing with C functions, esp. system functions).
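
To make the three kinds concrete, here is a minimal sketch of how they might
be declared in GPC. The program and variable names are made up for
illustration, and it assumes the "Capacity" field of the EP schema is
accessible as "s.Capacity":

  program StringKinds;

  var
    s: String (80);                     { EP string schema, carries its capacity }
    a: packed array [1 .. 80] of Char;  { SP/BP-style fixed array of char }
    c: CString;                         { pointer to a NUL-terminated C string }

  begin
    s := 'hello';
    { Only the schema knows its own capacity; for "a" and "c" the
      capacity has to be supplied from outside. }
    WriteLn ('capacity ', s.Capacity, ', length ', Length (s))
  end.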

These string types are already handled in some places (e.g. the Read/Write
functions), and will have to be handled in other places, too. So what we're
talking about here is actually the introduction of a fourth string type...

Since distinguishing between the string types is a lot of work, and often
a bit chaotic, we're looking for ways to simplify things. I hope the
AnyString parameters will be such a way. If things go as I hope, the
distinction will only have to be made in a few places in the compiler rather
than in each procedure operating on strings. Once we achieve this for the
three existing string types, adding a fourth one should not be much
work.

So, we're not overly keen on doing this work, but it seems inevitable.

> Second, a string variable in Borland Pascal does not contain information
> about the maximum length (like the capacity field in GPC), so for procedures
> that need this information (like Readln), the string capacity would have to
> be passed as an implicit parameter (or am I missing something?).

You're right. But it's the same for arrays of char and CStrings, where the
capacity is already passed separately. In the future, the AnyString
parameters should take care of this.

> Finally, and most importantly, why is this feature needed in the first place?
> The Extended Pascal string concept is more powerful than the Borland one (as
> it stores the string capacity in a field which my procedure `foo' can
> access).  However, the extra features of GPC (capacity field, and strings
> larger than 255 bytes) are *extensions* compared to Borland Pascal, so any
> Borland code with string manipulations should compile under GPC, and the
> binary (program) should produce the same results.

Not any code, unfortunately. It's not uncommon in BP to manipulate the string
length by accessing "s[0]", for lack of a proper way to set the length. (GPC
will soon provide a better way in the form of "AssignLength", working on all
string types...) You might say that a good program should not access the
length directly, but it's necessary for efficiency, e.g. in order to append
one character to a string: the proper way, "s := s + ch", will produce code
that copies the whole string around at least twice, in BP as well as
(currently) in GPC.
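
To illustrate the "s[0]" trick, here is a minimal sketch for Borland Pascal
(it relies on the length byte at index 0, so it is not meant for EP string
schemata). "AppendChar" is just a name made up for this example:

  program AppendDemo;

  var
    s: String;

  { Appends ch in place by bumping the length byte, without copying s. }
  procedure AppendChar (var s: String; ch: Char);
  begin
    if Length (s) < 255 then
      begin
        Inc (s[0]);             { s[0] holds the current length as a Char }
        s[Length (s)] := ch     { store ch in the newly added position }
      end
  end;

  begin
    s := 'abc';
    AppendChar (s, 'd');
    WriteLn (s)
  end.

Code like this is exactly what stops working when the string has no length
byte at "s[0]".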

Another example would be dynamically allocating strings of varying length in
BP, which is usually done (again, for lack of a better way) with statements
like:

  GetMem(p,Length(s)+1);
  p^:=s;

To make such statements work under GPC, GPC needs to have exactly the same
string type as BP.
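
For reference, a complete sketch of this idiom as it is commonly written in
BP with GetMem/FreeMem. The type name "PStr" and the program name are made up
here, and the code assumes BP's layout of one length byte followed by the
characters:

  program DynStr;

  type
    PStr = ^String;

  var
    s: String;
    p: PStr;

  begin
    s := 'hello';
    GetMem (p, Length (s) + 1);     { one length byte plus the characters }
    p^ := s;                        { copies only Length (s) + 1 bytes }
    WriteLn (p^);
    FreeMem (p, Length (p^) + 1)    { BP wants the same size on release }
  end.

The "+ 1" is the 1 byte of overhead of the BP string; with 8 bytes of
overhead the size computation would be wrong.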

-- 
Frank Heckenbach, Erlangen, Germany
heckenb@mi.uni-erlangen.de
http://home.pages.de/~fjf/links.htm

HTML conversion by Lluís de Yzaguirre i Maura
Institut de Lingüística Aplicada - Universitat "Pompeu Fabra"
e-mail: de_yza@upf.es