String types in future GPC versions

Tue, 28 Oct 1997 09:46:32 +0100 (MET)


>
>According to Extended Pascal, strings are written with parentheses
>("Str255 = String ( 255 )"), but GPC also accepts UCSD/Borland syntax
>("Str255 = String [ 255 ]").  In the future we *might* change this such
>that GPC produces an Extended Pascal string (unlimited length, 8 bytes
>overhead) with "(...)" but a Borland Pascal string (limited to 255 chars,
>1 byte overhead) with "[...]".
>

Wouldn't that feature (choosing between two memory allocation/layout schemes
for strings) create a lot of problems for relatively little benefit?

First, if I have a procedure with a var string type,

  procedure foo (var s : string)

can I pass variables of either string type (Borland or Extended)?  

Second, a string variable in Borland Pascal does not contain information
about the maximum length (like the capacity field in GPC), so for procedures
that need this information (like Readln), the string capacity would have to
be passed as an implicit parameter (or am I missing something?).

Finally, and most importantly, why is this feature needed in the first place?
The Extended Pascal string concept is more powerful than the Borland one (as
it stores the string capacity in a field which my procedure `foo' can
access).  However, the extra features of GPC (capacity field, and strings
larger than 255 bytes) are *extensions* compared to Borland Pascal, so any
Borland code with string manipulations should compile under GPC, and the
binary (program) should produce the same results.
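
To illustrate the point about the capacity field, here is a minimal sketch
(my own example, not taken from the GPC sources; the names CapacityDemo,
Str255 and Str20 are made up for illustration).  It assumes the Extended
Pascal behaviour that a `var s : String' parameter accepts strings of any
capacity and exposes that capacity as `s.Capacity':

```pascal
program CapacityDemo;

type
  Str255 = String (255);   { Extended Pascal syntax, capacity 255 }
  Str20  = String (20);    { same schema, smaller capacity }

var
  a : Str255;
  b : Str20;

{ One procedure serves both variables: the capacity travels with the string. }
procedure Foo (var s : String);
begin
  WriteLn ('capacity = ', s.Capacity, ', length = ', Length (s))
end;

begin
  a := 'hello';
  b := 'world';
  Foo (a);   { reports the capacity of a }
  Foo (b)    { reports the capacity of b }
end.
```

With a Borland-layout string there is no such field to read, so `Foo' would
need the capacity handed to it in some other way.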

I can only think of two reasons (see below) for having a string type which is
exactly identical to the Borland type, and neither is sufficient (IMHO) to
justify the additional work for the GPC developers:

  1) A program reading string variables from a binary file created by a
Borland Pascal program (.exe file).  Admittedly, this can only work directly
with an identical string type, but when porting the program it would be much
simpler to write a conversion routine (BP string -> EP string) than to add
BP-type strings to GPC;  perhaps we could even put such a conversion routine
in the GPC library or a BPcompat package.
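
A rough sketch of what such a conversion routine might look like.  Everything
here is hypothetical: the names BPString255 and BPToEP are mine, no such
routine exists in GPC as far as I know, and I am assuming the usual Borland
layout (one length byte followed by the characters):

```pascal
program BPToEPDemo;

type
  { Assumed image of a Borland String[255]: element 0 is the length byte. }
  BPString255 = packed array [0 .. 255] of Char;

var
  Raw       : BPString255;
  Converted : String (100);
  i         : Integer;

{ Hypothetical helper: copy a Borland-layout string into an EP string,
  truncating if the destination capacity is too small. }
procedure BPToEP (Src : BPString255; var Dest : String);
var
  i, n : Integer;
begin
  n := Ord (Src[0]);              { Borland keeps the length in byte 0 }
  if n > Dest.Capacity then
    n := Dest.Capacity;
  Dest := '';
  for i := 1 to n do
    Dest := Dest + Src[i]
end;

begin
  { Build a fake Borland string by hand instead of reading it from a file. }
  for i := 0 to 255 do
    Raw[i] := ' ';
  Raw[0] := Chr (5);
  Raw[1] := 'h';
  Raw[2] := 'e';
  Raw[3] := 'l';
  Raw[4] := 'l';
  Raw[5] := 'o';
  BPToEP (Raw, Converted);
  WriteLn (Converted)
end.
```

In a real port, `Raw' would be filled from the binary file rather than by
hand; the conversion itself stays the same.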

  2) GPC "wastes" more memory than Borland Pascal (8 bytes overhead instead
of 1; not counting the additional 1-3 bytes needed for alignment on a 4-byte
boundary).  Obviously, this is true in a limited sense; BUT

First of all, I don't think the memory is wasted: the capacity field is
useful in many cases, and the string can contain more than 255 bytes.  These
are very useful features, once we start forgetting about Borland Pascal and
rely on GPC for our programming projects.  Alignment is important for
performance reasons.
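
To put numbers on the overhead, here is a small back-of-the-envelope sketch.
It only encodes the figures quoted in this thread (1 length byte for Borland;
8 bytes of overhead, rounded up to a 4-byte boundary, for GPC); the function
names are made up, and the real GPC layout may differ in detail:

```pascal
program StringOverhead;

{ Borland layout as described above: one length byte plus the characters. }
function BorlandSize (Capacity : Integer) : Integer;
begin
  BorlandSize := Capacity + 1
end;

{ Assumed Extended Pascal layout: 8 bytes of overhead plus the characters,
  rounded up to the next multiple of 4. }
function ExtendedSize (Capacity : Integer) : Integer;
begin
  ExtendedSize := ((Capacity + 8 + 3) div 4) * 4
end;

begin
  WriteLn ('Borland  String [10]: ', BorlandSize (10), ' bytes');
  WriteLn ('Extended String (10): ', ExtendedSize (10), ' bytes')
end.
```

Under these assumptions a 10-character string costs 11 bytes in the Borland
layout and 20 bytes in the Extended layout, i.e. 9 bytes more per string.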

Second, as I see things, the concerns about "wasting memory" are largely a
die-hard habit from the old days when a Turbo Pascal program (code+data) was
limited to the 640K (DOS) boundary (if you have network drivers, maybe 400K
effectively), or even the *really* old days when TP produced .COM files
(64K).  As we all know, GPC uses all the memory installed (+ virtual memory
if that's not enough), so why care about a few bytes?  With Pentium (and
later) processors, what we should care about is proper alignment, even though
that means wasting a few bytes.

Jesper Lund


Jesper Lund (jel@hha.dk)
