On Wed, Jul 11, 2007 at 05:38:41PM -0600, Jordan Crouse wrote:
- int size;
u32 size maybe?
I guess, though I'll bet this code doesn't survive long enough to see those 2G ROM chips.. :)
I won't take that bet.
But it is nice to show that we have thought about it by being explicit. Eliminating potential portability problems is another plus.
- unsigned int *ptr = (unsigned int *) (lar->map + (filelen - 12));
What about all this pointer arithmetic? Is it really correct?
lar->map is a uchar, which is a size of 1, so the math works.
Yes, that's right. Sorry about the noise.
I think by convention, sizeof(uchar) is 1 everywhere, so there shouldn't be a portability issue here. If there is, then we'll have to do some very ugly casting in about 30 places in the code, and I'm hoping it doesn't come to that.
The alternative is u8 *map. Copy u32 size reasoning but much less important. No. Never mind.
- p = (unsigned int *) (ptr + BOOTBLOCK_SIZE - 12);
- p[0] = size;
+}
Then there's the matter of the pointer arithmetic again. Since ptr is uchar * this will work, but then writing size to p[0] will write a different number of bytes on different archs, right?
sizeof(unsigned int) is 4 on all the platforms we care about.
640k will be enough. ;)
I guess I could use u32 instead -
Please do. Or do I need to be hit over the head hard with a C cluestick?
but thats just going to end up decoding back down to unsigned int.
Fine.
If we used an unsigned long, then we would be in trouble, but I've made very sure we didn't do something like that.
Yeah, that's good. But is it good enough?
- err:
- if (lar->fd >= 0)
close(lar->fd);
- unlink(archive);
- if (lar)
free(lar);
If lar can be 0 at err: then the fd deref is a segfault waiting to happen.
Lar can't be 0 at err
Then the if (lar) free(lar); should change IMHO. It's confusing.
I'd like this though:
if (lar) { if (lar->fd >= 0) close(lar->fd); free(lar); } unlink(archive);
Is lar->fd initialized to -1 btw?
Race conditions. First open() then fstat() to get the size.
Okay, I agree. It will be a little bit more complex, but its probably better.
Make the size be a & argument to _open_lar() instead and hide the fstat() in there.
Also, the file size can change even though we've opened the file.
Since the size is a rather important variable throughout lar we want to be able to handle it changing under our feet, but I don't know what the best way would be?
Do we care? We're not designing LAR to be able to handle concurent processes changing the file at the same time - if it happens, it will be just pure, unmigitaged bad luck - same as if you happen to change a file while you 'cat' or 'dd' it. Is this something thats realistic enough to worry about?
It depends on the rest of the code. As long as lar does not go crazy if a larball changes in size while running we don't care, but some calculations do use that total size..
The file can be corrupt, I just want to avoid making it 4G-256k.
I would be happy if lar exits with an error if the file size has changed between _open_lar() and _close_lar().
+static int file_in_list(struct file *files, char *filename) +{
- struct file *p;
- if (files == NULL)
return 1;
Shouldn't this if just be removed so that a NULL list falls through the for and returns 0?
No - it should return a 1, because by design, files==NULL means "show all files".
That seems sort of unintuitive for a function named file_in_list?
I'd prefer having that negation in the caller. Or at the very least doxygen comment it.
ptr += (ntohl(header->len) + ntohl(header->offset) - 1)
& 0xfffffff0;
We want this piece of code in a single place before it's in 1000 places. Could you make a function of it?
Sure.
Also note that ntohl() already uses uint32_t - another reason for us to use all u32.
Again perror() or strerror(errno) would be nice. Goes for the entire patch of course. :)
Most of these were pretty quick and dirty. I was thinking while I was writing this that we should go through and standardize all the error and warning messages. This would be one of those things to do.
Sure, separate patch for this some other time would be great. I figured I'd mention it since this was all new code though.
if (ntohl(header->compression) == none) {
This ntohl() everywhere is tedious and a bit error prone. Perhaps it would be better to convert all data to machine byte order only when it's read rather than every time it's used?
The file is mmaped, so thats really impossible. It is tedious, but its the only way to ensure compatability with big endian machines.
We can abstract the header. I am generally strongly against unneccessary abstraction but in this case I think it's warranted. The code will be MUCH more readable and writable without all the ntohl().
Either have get functions for the header fields or make a native copy of the header in the new function that finds the next valid file header. I like the latter.
- if (filename[0] == '.' && filename[1] == '/')
filename += 2;
realpath() ?
I don't know if realpath is the right one here - because it returns the absolute path, which I don't think is the goal here.
No, but cwd would be stripped away to get a canonicalized relative path. I can break the above code just by doing ././filename while a construct based on realpath() would clean anything up.
There's no reason for a person to write ././foo of course, but it could easily happen with a layer or two of scripting.
I tried to squeeze the filename : pathname stuff in here without making too many changes - perhaps we should revisit that.
I think so.
?! Is the compressor function required even with algo none?
I think it does it to avoid complex logic - but we can look at this again - there's probably cleanup that can happen here.
Not that complex.. if (algo != none) { do_compression_stuff; } :)
Thanks for your comments.
Welcome. Thanks for the patch! =)
//Peter