Re: Modifying string length?
Thu, 1 May 1997 22:43:22 +0200
The African Chief wrote:
> >But how would you translate references to s[0] in BP then? I know, there
> >shouldn't be any in most programs, but sometimes it can be useful, e.g. for
> >efficiency. But it's ok for me if it's only enabled with {$X+}, see below.
>
> "s[0] := ..." is okay.
But you couldn't do that with long strings since their length isn't stored
in [0], but in .length. So there are cases where write access to .length is
needed, that's what I meant.
> I do that myself sometimes (but only to set the length to zero).
s:='' should do the same. (BP optimizes this to just setting the length to 0,
I hope gpc does as well.)
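
A minimal sketch of the two ways of changing the length discussed here,
assuming Borland-style short strings (Turbo/Borland Pascal, or Free Pascal
with a String[n] type). It does not apply to long strings, where the length
is not stored in element 0. The program name and the sample text are
made up for illustration.

```pascal
program SetLenDemo;
var
  s: String[40];            { Borland-style short string: length byte in s[0] }
begin
  s := 'Hello, world';
  s[0] := Chr(5);           { overwrite the length byte: keep the first 5 chars }
  WriteLn(s);               { prints "Hello" }
  s := '';                  { portable way to set the length to zero }
  WriteLn(Length(s));       { prints 0 }
end.
```

Writing to s[0] only shortens or lengthens the visible part; the characters
beyond the new length stay in memory untouched.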
> "length(s) := ...."
> or "s.length := ..."
>
> - which shouldn't compile (if at all), when using "--borland-pascal".
I think it should (at least with {$X+} or such), because it's the only way
to translate some kinds of procedures -- unless one wants to limit such
procedures to borland-like short strings.
> >Does this mean strings will always have an additional trailing #0? Do the
> >string routines currently support this (i.e. automatically add a #0 at the
> >end; but they should not recognize #0 as a terminator, but only rely on the
> >length field)? That would be good, because it simplifies converting a string
> >into a CString a lot (just "CString(@s[1])", right?).
>
> Delphi 2 does this with long strings (I think). If we are going to do it,
> perhaps it should only be done for long strings as well?
>
> However, what happens when you want to add a string to
> the end of it? Check this example, and look at the output
> under Borland;
>
> program Fred;
> var
>   s, s1: string[40];
> begin
>   s := 'Fred Smith' + #0;
>   s1 := s + 'is okay';
>   writeln(s1);
>   {prints "Fred Smith is okay" - where did the space before "okay"
>    come from? I didn't want or put it there - viz; problem with trailing "#0"}
> end.
As Peter has already pointed out, an implicit #0 would not behave like this.
I suppose this would be achieved by having the length not include the #0.
So the internal representation of 'foo' would be something like:
Long string: Capacity:...; Length: 3; string: ('f','o','o',#0,...)
Short string: (#3,'f','o','o',#0,...)
The question is, should the #0 be added for short strings as well?
If Length=Capacity, it's impossible, but otherwise???
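
A rough sketch of the long-string layout described above, with the length
not counting the #0. The record type, its field names (Len, Chars) and the
constant MaxLen are hypothetical and are not GPC's actual string schema.
Reserving one extra array element for the #0 is only one conceivable way to
handle the Length=Capacity case, not something settled in this thread.

```pascal
program LayoutSketch;
const
  MaxLen = 8;                                { hypothetical capacity }
type
  TSketchString = record                     { hypothetical, for illustration only }
    Capacity: Integer;
    Len: Integer;                            { does not count the trailing #0 }
    Chars: array [1 .. MaxLen + 1] of Char;  { +1 leaves room for the #0 }
  end;
var
  t: TSketchString;
  i: Integer;
begin
  t.Capacity := MaxLen;
  t.Len := 3;
  t.Chars[1] := 'f';
  t.Chars[2] := 'o';
  t.Chars[3] := 'o';
  t.Chars[t.Len + 1] := #0;                  { terminator sits outside the length }
  for i := 1 to t.Len do                     { string routines rely on Len only }
    Write(t.Chars[i]);
  WriteLn;                                   { prints "foo" }
  WriteLn(t.Len);                            { prints 3 }
end.
```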
BTW: Your example raises the "problem" that if you have a string with one or
more #0 in it, and convert it to a CString, the CString will be shorter than
expected. But I think there's nothing to be done about it. If one wants to
convert a string to a CString, one just must not put any #0 in it...
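
To illustrate the shortening, here is a small sketch assuming a Borland-style
dialect with the Strings unit (Borland Pascal 7 or Free Pascal). Since short
strings there have no implicit terminator, the final #0 is appended by hand.
The program name and sample text are made up.

```pascal
program NulDemo;
uses
  Strings;                           { provides StrLen for PChar }
var
  s: String[40];
  p: PChar;
begin
  s := 'Fred' + #0 + 'Smith' + #0;   { embedded #0 plus an explicit terminator }
  p := @s[1];                        { treat the character data as a CString }
  WriteLn(Length(s));                { prints 11: the Pascal length counts both #0 }
  WriteLn(StrLen(p));                { prints 4: the CString ends at the first #0 }
end.
```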
Peter Gerwinski wrote:
> > But how would you translate references to s[0] in BP then? I know, there
> > shouldn't be any in most programs, but sometimes it can be useful, e.g. for
> > efficiency. But it's ok for me if it's only enabled with {$X+}, see below.
>
> That's one reason why `ShortString's should be implemented - which will
> have "s[0]". But this job doesn't have high priority for me ...
As I said above, it would (currently) suffice for me if "s[0]:=..." could be
rewritten by hand as "s.CurrentLength:=..." for long strings. But at the
moment, there is no way to translate it at all. The BP syntax "s[0]" itself
is, IMHO, not very good, so it should not be supported (emulated) for long
strings.
> > Sounds good! {$X} is a local switch, isn't it (so one can turn it on and off
> > around code that needs it, and be safe from accidents elsewhere). It would be
> > even better than BP (where accidental writes to s[0] are not even detected
> > by range checking). :-)
>
> As long as GPC has no range checking at all ...
I know...
Currently the length could probably be changed by writing to
s[1-SizeOf(Integer)] .. s[0] -- but I really don't want to try that... :-|
BTW: Does the standard require range checking (or does it say anything about
it at all)?
> "CurrentLength"?
Why not.
> What about this: With (*$X+*), addresses of packed array members are
> only rejected if they don't lie on a byte boundary, and otherwise they
> work?
And with {$X-} they don't work at all, you mean? Seems ok.
And the same for packed records?
> Another idea, perhaps even better: let "String ( 255 )" or
> "LongString ( 255 )" denote an Extended Pascal (long) string, but
> "String [ 255 ]" or "ShortString [ 255 ]" a UCSD Pascal (short)
> string? Like this, both Extended Pascal and Borland Pascal programs
> will just compile. Needless to say that `--extended-pascal' will
> switch off "String [ 255 ]" and `--borland-pascal' will switch off
> "String ( 255 )" ...
Isn't it good when non-standard compilers (BP) use non-standard syntax? :-)
BTW: String[n] with n>255 wouldn't work at all then? Seems ok to me.
--
Frank Heckenbach, Erlangen, Germany
heckenb@mi.uni-erlangen.de
Turbo Pascal: http://www.mi.uni-erlangen.de/~heckenb/programs.htm
Internet links: http://www.mi.uni-erlangen.de/~heckenb/links.htm