Re: Modifying string length?

Thu, 1 May 1997 22:43:22 +0200


 
The African Chief wrote:

> >But how would you translate references to s[0] in BP then? I know, there
> >shouldn't be any in most programs, but sometimes it can be useful, e.g. for
> >efficiency. But it's ok for me if it's only enabled with {$X+}, see below.
> 
> "s[0] := ..." is okay.

But you couldn't do that with long strings since their length isn't stored
in [0], but in .length. So there are cases where write access to .length is
needed, that's what I meant.
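
To illustrate the kind of procedure I mean, here is a minimal sketch in BP style. It trims trailing blanks by writing the length byte directly. TStr and TrimRight are names made up for this example, and it only works for short strings, which is exactly the limitation in question.

```pascal
program TrimDemo;

type
  TStr = String[80];   { hypothetical short string type for this sketch }

{ Remove trailing blanks by shrinking the length byte in place. }
procedure TrimRight(var s: TStr);
var
  n: Integer;
begin
  n := Length(s);
  while (n > 0) and (s[n] = ' ') do
    Dec(n);
  s[0] := Chr(n)       { BP-only: the length is stored in s[0] }
end;

var
  s: TStr;
begin
  s := 'foo   ';
  TrimRight(s);
  WriteLn('[', s, ']');   { prints [foo] }
  WriteLn(Length(s):1)    { prints 3 }
end.
```

For a long string the last line of TrimRight would have to assign to the length field instead, which is why I'd like write access to it.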

> I do that myself sometimes (but only to set the length to zero).

s:='' should do the same. (BP optimizes this to just setting the length to 0,
I hope gpc does as well.)

>         "length(s) := ...."
> or        "s.length := ..."
> 
> - which shouldn't compile (if at all), when using "--borland-pascal".

I think it should (at least with {$X+} or such), because it's the only way
to translate some kinds of procedures -- unless one wants to limit such
procedures to borland-like short strings.
 
> >Does this mean strings will always have an additional trailing #0? Do the
> >string routines currently support this (i.e. automatically add a #0 at the
> >end; but they should not recognize #0 as a terminator, but only rely on the
> >length field)? That would be good, because it simplifies converting a string
> >into a CString a lot (just "CString(@s[1])", right?).
> 
> Delphi 2 does this with long strings (I think). If we are going to do it,
> perhaps it should only be done for long strings as well?
> 
> However, what happens when you want to add a string to
> the end of it? Check this example, and look at the output
> under Borland;
> 
> program Fred;
> var
> s,s1:string[40];
> begin
>  s  := 'Fred Smith'+#0;
>  s1 := s + 'is okay'; 
>  writeln ( s1 ); 
> {prints "Fred Smith is okay" - where did the space before "okay" 
> come from? I didn't want or put it there - viz; problem with trailing "#0"}
> end.

As Peter has already pointed out, an implicit #0 would not behave like this.
I suppose this would be achieved by having the length not include the #0.
So the internal representation of 'foo' would be something like:

Long string:  Capacity:...; Length: 3; string: ('f','o','o',#0,...)
Short string: (#3,'f','o','o',#0,...)

The question is, should the #0 be added for short strings as well?
If Length=Capacity, there is no room for it, but what about otherwise?
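
A rough sketch of how I imagine the implicit #0 behaving, emulated with a record. TLongStr, Init and Append are made-up names, and this is not GPC's actual string schema. The array reserves one slot beyond Capacity for the #0, which is my own assumption and sidesteps the Length=Capacity case.

```pascal
program ImplicitNulDemo;

const
  MaxCap = 40;

type
  { Hypothetical layout for illustration only }
  TLongStr = record
    Capacity: Integer;
    Len: Integer;                         { does not count the trailing #0 }
    Chars: array[1..MaxCap + 1] of Char   { one spare slot for the #0 }
  end;

procedure Init(var d: TLongStr);
begin
  d.Capacity := MaxCap;
  d.Len := 0;
  d.Chars[1] := #0
end;

{ Append src; characters that do not fit into Capacity are dropped. }
procedure Append(var d: TLongStr; const src: String);
var
  i: Integer;
begin
  for i := 1 to Length(src) do
    if d.Len < d.Capacity then
    begin
      Inc(d.Len);
      d.Chars[d.Len] := src[i]   { the first write overwrites the old #0 }
    end;
  d.Chars[d.Len + 1] := #0       { re-terminate outside the counted length }
end;

var
  t: TLongStr;
  i: Integer;
begin
  Init(t);
  Append(t, 'Fred Smith');
  Append(t, ' is okay');
  for i := 1 to t.Len do
    Write(t.Chars[i]);
  WriteLn;                       { prints Fred Smith is okay }
  WriteLn(t.Len:1)               { prints 18 }
end.
```

Since the #0 is never counted in the length, concatenation simply overwrites it and no stray character ends up in the middle of the result.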

BTW: Your example raises the "problem" that if you have a string with one or
more #0 in it, and convert it to a CString, the CString will be shorter than
expected. But I think there's nothing to be done about it. If one wants to
convert a string to a CString, one just must not put any #0 in it...
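
A small sketch of the effect. CLen is a made-up helper that computes the length a C routine would see, i.e. the number of characters before the first #0.

```pascal
program EmbeddedNulDemo;

{ Length as seen by a C routine: characters before the first #0. }
function CLen(const s: String): Integer;
var
  i, n: Integer;
begin
  n := Length(s);
  { Scan downwards so that the last hit is the first #0 in the string. }
  for i := Length(s) downto 1 do
    if s[i] = #0 then
      n := i - 1;
  CLen := n
end;

var
  s: String[40];
begin
  s := 'Fred' + #0 + 'Smith';
  WriteLn(Length(s):1);   { prints 10 }
  WriteLn(CLen(s):1)      { prints 4 }
end.
```

The Pascal length is 10, but anything that treats the data as a CString stops after 'Fred'.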

Peter Gerwinski wrote:

> > But how would you translate references to s[0] in BP then? I know, there
> > shouldn't be any in most programs, but sometimes it can be useful, e.g. for
> > efficiency. But it's ok for me if it's only enabled with {$X+}, see below.
> 
> That's one reason why `ShortString's should be implemented - which will
> have "s[0]".  But this job doesn't have high priority for me ...

As I said above, it would (currently) suffice for me if "s[0]:=..." could be
translated to "s.CurrentLength:=..." for long strings. But at the moment, it
can't be translated at all. The BP syntax "s[0]" is, IMHO, not very good, so
it should not be supported (emulated) for long strings.

> > Sounds good! {$X} is a local switch, isn't it (so one can turn it on and off
> > around code that needs it, and be safe from accidents elsewhere). It would be
> > even better than BP (where accidental writes to s[0] are not even detected
> > by range checking). :-)
> 
> As long as GPC has no range checking at all ...

I know...

Currently the length could probably be changed by writing to
s[1-SizeOf(Integer)] .. s[0] -- but I really don't want to try that... :-|

BTW: Does the standard require range checking (or does it say anything about
it at all)?

> "CurrentLength"?

Why not.
 
> What about this:  With (*$X+*), addresses of packed array members are
> only rejected if they don't lie on a byte boundary, and otherwise they
> work?

And with {$X-} they don't work at all, you mean? Seems ok.
And the same for packed records?
 
> Another idea, perhaps even better:  let "String ( 255 )" or
> "LongString ( 255 )" denote an Extended Pascal (long) string, but 
> "String [ 255 ]" or "ShortString [ 255 ]" a UCSD Pascal (short)
> string?  Like this, both Extended Pascal and Borland Pascal programs
> will just compile.  Needless to say that `--extended-pascal' will
> switch off "String [ 255 ]" and `--borland-pascal' will switch off
> "String ( 255 )" ...

Isn't it good when non-standard compilers (BP) use non-standard syntax? :-)

BTW: String[n] with n>255 wouldn't work at all then? Seems ok to me.
-- 
Frank Heckenbach, Erlangen, Germany
heckenb@mi.uni-erlangen.de
Turbo Pascal:   http://www.mi.uni-erlangen.de/~heckenb/programs.htm
Internet links: http://www.mi.uni-erlangen.de/~heckenb/links.htm
