:: Re: [DNG] netman GIT project
Author: Irrwahn
Date:  
To: dng
Subject: Re: [DNG] netman GIT project
On Tue, 25 Aug 2015 15:24:01 +0100, Rainer Weikusat wrote:
> Irrwahn <irrwahn@???> writes:

<snip>
>> It is totally sensible to break down the character set to something that
>> is more or less guaranteed to be valid for building names in any file
>> system currently in use on this planet.
>
> This targets Linux with no chance of (and no intention to be) portable
> to anything else as it's basically a wrapper around a bunch of Linux
> (and even Debian) commands. There are exactly two bytes/chars which must not
> appear in a filename under these circumstances, '\0' and '/'. Anything
> else is valid. Encoding other non-printable characters makes some sense
> in case these files are intended to be perused by humans, however, if
> this is not intended, it can as well be skipped as software doesn't need
> to 'look' at graphemes to distinguish byte sequences. Blindly extending
> this to "anything with bit 8 set" means it will replace all non-ASCII
> characters with "something completely unintelligible to humans" despite
> the machine still doesn't care. And that's not only the odd 'national
> characters' which appear in Western European alphabets but potentially,
> completely independent ones like Greek or Cyrillic.


Well, the POSIX "portable filename character set" consists of these characters:
A–Z a–z 0–9 . _ -
But you are probably right that we can safely assume modern, sane,
Unicode-aware, "linuxy" filesystems for the case at hand. That leaves
us with two reasonable approaches:

A) Only escape '/' and '\0' (and the escape character, of course) and
leave the rest as-is, making for plain human readable filenames, or

B) Hex (or even base64 :P) encode the whole shebang and not give a hoot
about readability.
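
To make approach A concrete, here is a rough sketch in Python (chosen
purely for illustration, it is not the project's language). The choice
of '%' as escape character and the two-hex-digit format are assumptions,
as are the names escape_ssid and unescape_ssid; nothing here is taken
from netman itself.

```python
ESC = ord("%")  # assumed escape character; any byte works as long as it is escaped too
MUST_ESCAPE = {0x00, ord("/"), ESC}  # NUL, '/', and the escape byte itself


def escape_ssid(raw: bytes) -> bytes:
    """Escape only the bytes that cannot appear in a Linux filename."""
    out = bytearray()
    for b in raw:
        if b in MUST_ESCAPE:
            out += b"%%%02X" % b  # e.g. '/' becomes b"%2F"
        else:
            out.append(b)  # everything else, including non-ASCII, is kept as-is
    return bytes(out)


def unescape_ssid(name: bytes) -> bytes:
    """Reverse escape_ssid; raises ValueError on a truncated or bad escape."""
    out = bytearray()
    i = 0
    while i < len(name):
        if name[i] == ESC:
            out.append(int(name[i + 1:i + 3].decode("ascii"), 16))
            i += 3
        else:
            out.append(name[i])
            i += 1
    return bytes(out)


raw = "Café/1".encode("utf-8") + b"\x00%"
escaped = escape_ssid(raw)
print(escaped)          # the UTF-8 bytes of 'é' pass through untouched
print(raw.hex())        # approach B for comparison: safe, but unreadable
```

With this scheme a name like "Café" stays readable in a directory
listing, whereas the hex form of approach B does not.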

Personally, I'd opt for A. And while at it, treat an SSID as what it is:
"a sequence of 0 to 32 octets", i.e. pass it around as a buffer with
associated length information, just as a wireless station does.
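
A small sketch of why the length has to travel with the buffer, again in
Python for illustration only. The names make_ssid and as_c_string are
hypothetical; as_c_string merely mimics what a NUL-terminated string
would keep of the same octets.

```python
MAX_SSID_LEN = 32  # an SSID is a sequence of 0 to 32 octets


def make_ssid(raw: bytes) -> bytes:
    """Accept any octets, including NUL, as long as the length is 0..32."""
    if len(raw) > MAX_SSID_LEN:
        raise ValueError("SSID longer than %d octets" % MAX_SSID_LEN)
    return bytes(raw)  # a bytes object carries its own length


def as_c_string(raw: bytes) -> bytes:
    """What survives if the same octets are treated as a NUL-terminated string."""
    return raw.split(b"\x00", 1)[0]


ssid = make_ssid(b"ab\x00cd")
print(len(ssid), len(as_c_string(ssid)))  # full length vs. truncated length
```

The empty SSID and SSIDs with embedded NUL octets are both legal, and
neither survives handling that relies on a terminator.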

I'd go and have a stab at it, if only for the fun of it, but that would
affect the backend incantations [*]. And since I wrote my last line of
Pascal in 1987, I'd need Edward's consent and assistance.

[*] Best way would probably be to escape the SSID in the frontend, pass
the encoded "string" and only decode in the backend where actually needed.
That way at least the argument handling wouldn't have to change much.
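
The frontend/backend split from [*] could be sketched like this. Note
that this swaps in standard percent-encoding from Python's urllib.parse
rather than the minimal escaping of approach A, simply because it is
readily available; it also escapes non-ASCII octets. The BACKEND snippet
is a hypothetical stand-in for the real backend, which is not written in
Python.

```python
import subprocess
import sys
from urllib.parse import quote_from_bytes

# Stand-in backend: decodes argv[1] only where the raw octets are needed.
BACKEND = (
    "import sys\n"
    "from urllib.parse import unquote_to_bytes\n"
    "sys.stdout.buffer.write(unquote_to_bytes(sys.argv[1]))\n"
)


def frontend_encode(raw: bytes) -> str:
    """Turn arbitrary SSID octets into a plain ASCII argument string."""
    return quote_from_bytes(raw, safe="")  # safe="" so that '/' is escaped too


def call_backend(raw: bytes) -> bytes:
    """Pass the encoded SSID as an ordinary command-line argument."""
    encoded = frontend_encode(raw)
    result = subprocess.run(
        [sys.executable, "-c", BACKEND, encoded],
        capture_output=True,
        check=True,
    )
    return result.stdout  # the octets as the backend decoded them


raw = b"my/net\x00\xff"
print(frontend_encode(raw))
print(call_backend(raw) == raw)
```

Since a command-line argument cannot contain a NUL byte, some encoding
is needed on that path anyway; the encoded form is an ordinary string,
so the existing argument handling would barely need to change.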

--
Irrwahn