Author: Edward Bartolo
Date:
To: irrwahn35
CC: dng
Subject: Re: [DNG] [OT] [Re: Studying C as told. (For help)
Hi,

Irrwahn35 wrote:
<<
[...]
>     if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {

[...]
You should *never* assume that the latin letters occur in
the execution character set in ascending consecutive order.
(Though similar *is* guaranteed for the digits '0' to '9'!)
>>
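
To make the quoted point concrete, here is a minimal sketch (an illustration, not code from the thread). It relies on the one ordering the C standard does guarantee, the digits, and guards the letter ranges with the kind of compile-time test Irrwahn mentions further down. The names digit_value and isletter_range are made up for this example, and static_assert assumes a C11 compiler. The assertion only checks the endpoints of each range, so it is a plausibility check rather than a proof of contiguity.

```c
#include <assert.h>  /* static_assert (C11) */

/* Refuse to build where 'a'..'z' or 'A'..'Z' do not span exactly 26 codes.
   EBCDIC, for example, fails this check. Endpoints only, not a full proof. */
static_assert('z' - 'a' == 25 && 'Z' - 'A' == 25,
              "letters are not contiguous in the execution character set");

/* Always portable: '0'..'9' are guaranteed to be consecutive.
   Returns the numeric value of a digit, or -1 for anything else. */
int digit_value(char c) {
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}

/* The range test from the quoted code. It is only trustworthy here
   because the static assertion above has already passed. */
int isletter_range(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
```

On a platform where the assertion fails, the build stops with the message above instead of silently misclassifying characters.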


I think a function like the following can safely be used to avoid
situations where the character set's ordering deviates from what I assumed.

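/* The 52 letters to accept; a string literal is already NUL-terminated. */
const char alphabet[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

/* Return nonzero if ch is one of the letters in alphabet.
   Makes no assumption about the order of the character codes. */
int isletter(char ch) {
    int p;

    /* Stop at the first match or at the terminating NUL. */
    for (p = 0; alphabet[p] != '\0' && ch != alphabet[p]; ++p)
        ;
    return alphabet[p] != '\0';
}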

And
<< if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) >>

would reduce to:

if (isletter(c))

Hopefully, I did it right.
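
One way to check this, sketched under the assumption that the program stays in the default "C" locale (no setlocale() call), is to compare the lookup against the standard isalpha() for every char value. The helpers isletter_std, count_mismatches and check_isletter are hypothetical names introduced only for this sketch; isletter is restated so the block compiles on its own.

```c
#include <assert.h>  /* assert */
#include <ctype.h>   /* isalpha */
#include <limits.h>  /* CHAR_MIN, CHAR_MAX */

static const char alphabet[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

/* Table lookup as in this message: nonzero if ch occurs in alphabet. */
int isletter(char ch) {
    int p;
    for (p = 0; alphabet[p] != '\0' && ch != alphabet[p]; ++p)
        ;
    return alphabet[p] != '\0';
}

/* Standard-library version. The cast matters: passing a negative
   char value to isalpha() is undefined behaviour. */
int isletter_std(char ch) {
    return isalpha((unsigned char)ch) != 0;
}

/* Count the char values on which the two functions disagree. */
int count_mismatches(void) {
    int c, n = 0;
    for (c = CHAR_MIN; c <= CHAR_MAX; ++c)
        if (isletter((char)c) != isletter_std((char)c))
            ++n;
    return n;
}

/* Abort if the lookup and isalpha() ever disagree ("C" locale assumed). */
void check_isletter(void) {
    assert(count_mismatches() == 0);
}
```

In other locales isalpha() may accept additional characters, so the two are only expected to agree in the "C" locale.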

Edward




On 21/06/2016, Irrwahn <irrwahn@???> wrote:
> On Tue, 21 Jun 2016 16:00:53 +0200, Adam Borowski wrote:
>> On Tue, Jun 21, 2016 at 03:13:21PM +0200, Irrwahn wrote:
>>> On Tue, 21 Jun 2016 14:42:46 +0200, Edward Bartolo wrote:
>>> [...]
>>>>     if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
>>> [...]
>>> You should *never* assume that the latin letters occur in
>>> the execution character set in ascending consecutive order.
>>> (Though similar *is* guaranteed for the digits '0' to '9'!)

>>
>> Not really -- assuming ASCII is like assuming 8-bit bytes[1]. Both could be
>> false at the dawn of time, but today trying to support that is a waste of
>> time. Too much code you rely on makes that assumption.
>
> Neither is an excuse for not using isalpha() et al. instead
> of the above abomination. I'm not even going to mention that the
> notion of "character representing an alphabetic letter" differs
> between locales. WTF, I just did, never mind!
>
> And "there's so much broken code already you rely on" should
> never be an excuse to deliberately produce even more broken code.
>
>> It's called
>> "standardization".
>
> Oh, what times! Oh, what standards! It is not covered by the
> C standard, and I'm not going to make any guesses (however
> educated) on what past, present, or future platforms my code
> is going to be run. Or, *if* I did nonetheless, I'd provide a
> compile time test to prevent mishaps. YMMV.
>
>> Really, I'd say it's becoming reasonable to assume Unicode.
>
> I wouldn't necessarily introduce it in chapter 1 in a beginners
> book on C, though.
>
>> [1]. Byte != char, some special purpose processors even today have 32-bit
>> chars (i.e., are unable to address individual bytes without bit twiddling)
>> which is legal according to the C standard.
>
> IOW: Those processors (that cannot directly address units < 32 bit
> in width) sport 32-bit bytes! Processors have no notion of 'char',
> and in C a char *is* a byte *is* a char. Always, non-negotiable.
>
> Woof!
> Urban
>
> _______________________________________________
> Dng mailing list
> Dng@???
> https://mailinglists.dyne.org/cgi-bin/mailman/listinfo/dng
>