Re: Modifying string length?

Thu, 1 May 1997 04:12:47 +0200



The African Chief wrote:

> >> It would not be too difficult to make "MyStr.length" and "MyStr.String" 
> >> accessible "record fields" of strings.  Would it be desirable?
> >
> >Yes! (At least length; the String part is accessible anyway with s[n].)
> >
> >>  I am
> >> not sure how this fits into the standard ...
> > 
> >I'm not, either. If it conflicts, it should be disabled with
> >"--extended-pascal", I suppose.
> 
>  It should be disabled with "--borland-pascal" as well.

But how would you translate references to s[0] in BP then? I know, there
shouldn't be any in most programs, but sometimes it can be useful, e.g. for
efficiency. But it's ok for me if it's only enabled with {$X+}, see below.
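
For anyone not familiar with the BP idiom, here is a minimal sketch. It
assumes Borland Pascal or a compiler with BP-compatible short strings, where
byte 0 of a string holds its length; it is not meant to describe how GPC
stores strings. The program and variable names are made up for illustration.

```pascal
program SetLen;
{ Assumption: BP-style short strings, where s[0] holds the current length. }
var
  s: string[20];
begin
  s := 'Hello, world';
  s[0] := Chr(5);   { shorten the string in place to its first 5 characters }
  WriteLn(s)        { prints: Hello }
end.
```

The characters after the new length stay in memory; only the length byte
changes, which is why this is cheap but also easy to get wrong.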

Peter Gerwinski wrote:

> 2) In GPC strings, the schema discriminant "Capacity" is accessible
>    as a "record field" of the string.  Is this okay by ISO 10206,
>    and should it be done for other schemata this way, too?  And how to
>    treat the "length" field of a string?  ISO 10206 6.4.3.3 does not
>    say anything about that.  I could imagine the string schema type to
>    be something similar to
> 
>        Type
>          String ( Capacity: Integer ) = record
>            length: Integer;
>            String: packed array [ 1..Capacity ] of Char;
>          end (* String *);
> 
>    except that the "String" field is automatically dereferenced.

Oh - I thought a string was like this already. (My previous mails were based
on this assumption.) I think I read this somewhere, but don't remember where.

But, a *packed* array? If I remember correctly, it's not possible to take the
address of a component of a packed record/array, because such a component
might not start on a byte boundary. (When I tried it, though, I ran into some
problems; see below.)

If this is so (and won't be changed so that it is possible to get an address
of those components that actually start on a byte boundary), I'd vote for an 
un-packed array here. BTW: Is there any advantage in packing here? AFAICS (at
least with Linux), an un-packed array of char is exactly as big as a packed
one. Is this not always so?

> It's "1..Capacity + 1".  (Increasing the index by one adds one char.)
> > +1 to allow for a trailing zero (or zeros in case of Unicode...).

Does this mean strings will always have an additional trailing #0? Do the
string routines currently support this (i.e. automatically add a #0 at the
end; but they should not recognize #0 as a terminator, but only rely on the
length field)? That would be good, because it simplifies converting a string
into a CString a lot (just "CString(@s[1])", right?).
 
> What about the following:
> 
>   * Default mode (and Extended Pascal mode):  The length field cannot
>     be accessed (except with "length ( MyStr )".)
> 
>   * With extended syntax (*$X+*):  The length field can be read- and
>     write-accessed.
> 
> Okay like that?  Or perhaps better to have it in default mode, too,
> and to switch it OFF in Extended Pascal mode?  I think it's best as
> above because you should assign a new length to a string only if you
> know exactly what you are doing.

Sounds good! {$X} is a local switch, isn't it? (So one can turn it on and off
around code that needs it, and be safe from accidents elsewhere.) It would be
even better than BP (where accidental writes to s[0] are not even detected
by range checking). :-)

BTW: Should the field be called "length" like the function length? It might
give conflicts with "with" statements ("with" works with schemata, doesn't 
it?) as in:

with dest_str do
 begin
  length:=length(src_str);
  ...
 end;

Perhaps something like str_length or such?
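
To make the conflict concrete, here is a small sketch with an ordinary record
standing in for the proposed string layout. ModelStr, Chars, Dest, Src and N
are hypothetical names for illustration only, not GPC's actual string type.
Inside the "with" block, "length" means the field, so the function has to be
called outside of it:

```pascal
program WithShadow;

const
  Src = 'hello';

type
  { Hypothetical stand-in for a string with an accessible length field. }
  ModelStr = record
    length: Integer;
    Chars: packed array [1..20] of Char
  end;

var
  Dest: ModelStr;
  N: Integer;

begin
  N := length(Src);     { outside "with": the standard function, gives 5 }
  with Dest do
    begin
      { Here "length" names the field Dest.length, so writing
        "length := length(Src)" would not call the function. }
      length := N
    end;
  WriteLn(Dest.length : 1)   { prints 5 }
end.
```

A differently named field would avoid the need for the temporary variable.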

> (* By the way:  Good news for all friends of schemata!  They might be
> finished (except, of course, bugs) this week ... *)

What? The bugs will not be finished? What an omission! ;-)


As I said above, I had some problems with the new packed arrays. I tried the
following:

program t;
var
 q:packed array [1..10] of boolean;
 y:^boolean;
begin
 y:=@q[3]
end.

and it compiled (though I think it shouldn't), giving quite a strange result
in y.

Then I did:

program t;
var
 q:packed array [1..10] of boolean;
 y:^boolean;
 i:integer;
begin
 for i:=1 to 10 do y:=@q[i]
end.

gpc said "invalid lvalue in unary &".

Though this shouldn't compile, this error message seems strange to me.
I'd have expected "Attempt to take address of bit-field structure member",
as with packed records.

Next I tried:

program t;
var
 q:packed array [1..10] of boolean;
 y:^boolean;
 i:integer;
begin
 for i:=1 to 10 do y:=@(q[i])
end.

The result was: "parse error before '('" and some more error messages.
(The same with an un-packed array here.)

Should the "()" be allowed here (though not required)?
FWIW, TP5.5 doesn't allow them, BP7.0 does...
-- 
Frank Heckenbach, Erlangen, Germany
heckenb@mi.uni-erlangen.de
Turbo Pascal:   http://www.mi.uni-erlangen.de/~heckenb/programs.htm
Internet links: http://www.mi.uni-erlangen.de/~heckenb/links.htm

