On 12/07/07 00:49 +0200, Peter Stuge wrote:
Really good ideas! But some comments on implementation..
On Wed, Jul 11, 2007 at 12:15:17PM -0600, Jordan Crouse wrote:
+struct lar {
- int fd;
- unsigned char *map;
- int size;
+};
u32 size maybe?
I guess, though I'll bet this code doesn't survive long enough to see those 2G ROM chips.. :)
+static int lar_read_size(struct lar *lar, int filelen) +{
- unsigned int *ptr = (unsigned int *) (lar->map + (filelen - 12));
- return ptr[0];
+}
What about all this pointer arithmetic? Is it really correct?
Doesn't adding 1 to a pointer actually add sizeof pointed-to-type to the address? Ie. this code isn't portable?
lar->map is a uchar, which is a size of 1, so the math works. I think by convention, sizeof(uchar) is 1 everywhere, so there shouldn't be a portability issue here. If there is, then we'll have to do some very ugly casting in about 30 places in the code, and I'm hoping it doesn't come to that.
+static void annotate_bootblock(unsigned char *ptr, unsigned int size) +{
- int i;
- unsigned int *p;
- for(i = 13; i > 0; i--)
ptr[BOOTBLOCK_SIZE - i] = '\0';
I think a memset() call here would be nicer..
Oops - I copied that without thinking. Yeah - memset best.
- p = (unsigned int *) (ptr + BOOTBLOCK_SIZE - 12);
- p[0] = size;
+}
Then there's the matter of the pointer arithmetic again. Since ptr is uchar * this will work, but then writing size to p[0] will write a different number of bytes on different archs, right?
sizeof(unsigned int) is 4 on all the platforms we care about. I guess I could use u32 instead - but thats just going to end up decoding back down to unsigned int. If we used an unsigned long, then we would be in trouble, but I've made very sure we didn't do something like that.
- err:
- if (lar->fd >= 0)
close(lar->fd);
- unlink(archive);
- if (lar)
free(lar);
If lar can be 0 at err: then the fd deref is a segfault waiting to happen.
Lar can't be 0 at err - failure to malloc lar: on line 132 will immediately return. err: is there specifically to cleanup lar->fd.
+static void _close_lar(struct lar *lar) +{
- munmap(lar->map, lar->size);
- close(lar->fd);
- free(lar);
+}
Maybe add if(!lar) return; ? Maybe even complain with a warning.
Thats a good idea.
- /* Make a dummy bootblock */
- if (lar_add_bootblock(lar, NULL)) {
_close_lar(lar);
return NULL;
- }
This will leave a half-baked lar file hanging if adding the bootblock fails. Perhaps unlink() too in case of error?
Hmm - sure.
+struct lar * lar_open_archive(const char *archive) +{
- struct lar *lar;
- int ret, romlen;
- struct stat s;
- ret = stat(archive, &s);
- if (ret == -1) {
err("Unable to stat %s\n", archive);
return NULL;
- }
- lar = _open_lar(archive, s.st_size, O_RDWR);
Race conditions. First open() then fstat() to get the size.
Okay, I agree. It will be a little bit more complex, but its probably better.
Also, the file size can change even though we've opened the file.
Since the size is a rather important variable throughout lar we want to be able to handle it changing under our feet, but I don't know what the best way would be?
Do we care? We're not designing LAR to be able to handle concurent processes changing the file at the same time - if it happens, it will be just pure, unmigitaged bad luck - same as if you happen to change a file while you 'cat' or 'dd' it. Is this something thats realistic enough to worry about?
+static int file_in_list(struct file *files, char *filename) +{
- struct file *p;
- if (files == NULL)
return 1;
Shouldn't this if just be removed so that a NULL list falls through the for and returns 0?
No - it should return a 1, because by design, files==NULL means "show all files".
ptr += (ntohl(header->len) + ntohl(header->offset) - 1)
& 0xfffffff0;
We want this piece of code in a single place before it's in 1000 places. Could you make a function of it?
Sure.
+static int _write_file(char *filename, unsigned char *buffer, int len) +{
- char *path = strdup(filename);
- int fd, ret;
- if (path == NULL) {
err("Out of memory.\n");
return -1;
- }
- mkdirp((const char *) dirname(path), 0755);
- free(path);
- fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC, 0644);
- if (fd == -1) {
err("Error creating file %s\n", filename);
return -1;
- }
I don't think lar should mkdir -p implicitly. Just open() and perror() or print strerror(errno) in case of failure.
This came from the original code - I think that Stefan and Patrick were going for the same behavior as tar - and tar definately does implicitly use mkdir -p on extract.
- ret = write(fd, buffer, len);
- if (ret != len)
err("Error writingthe file %s\n", filename);
Again perror() or strerror(errno) would be nice. Goes for the entire patch of course. :)
Most of these were pretty quick and dirty. I was thinking while I was writing this that we should go through and standardize all the error and warning messages. This would be one of those things to do.
if (strncmp(header->magic, MAGIC, 8))
break;
Is strncmp() good for MAGIC ? Maybe memcmp() ?
Yes - memcmp is better.
if (ntohl(header->compression) == none) {
This ntohl() everywhere is tedious and a bit error prone. Perhaps it would be better to convert all data to machine byte order only when it's read rather than every time it's used?
The file is mmaped, so thats really impossible. It is tedious, but its the only way to ensure compatability with big endian machines.
- if (strstr(name, "nocompress:") == name) {
filename += 11;
thisalgo = none;
- }
strncmp() ?
Yeah - I guess either one would be ok. Should go with strncmp - its probably faster.
- if (filename[0] == '.' && filename[1] == '/')
filename += 2;
realpath() ?
I don't know if realpath is the right one here - because it returns the absolute path, which I don't think is the goal here. I tried to squeeze the filename : pathname stuff in here without making too many changes - perhaps we should revisit that.
- ret = stat (filename, &s);
- if (ret) {
err("Unable to stat %s\n", filename);
return -1;
- }
- /* Allocate a temporary buffer to compress into - this is unavoidable,
because we need to make sure that the compressed data will fit in
the LAR, and we won't know the size of the compressed data until
we actually compress it */
- temp = calloc(s.st_size, 1);
- if (temp == NULL) {
err("Out of memory.\n");
return -1;
- }
- /* Open the file */
- fd = open(filename, O_RDONLY);
Race again.
Got it.
- /* Do the compression step */
- compress_functions[thisalgo](ptr, s.st_size, temp, &complen);
- if (complen >= s.st_size && (thisalgo != none)) {
thisalgo = none;
compress_functions[thisalgo](ptr, s.st_size, temp, &complen);
- }
?! Is the compressor function required even with algo none?
I think it does it to avoid complex logic - but we can look at this again - there's probably cleanup that can happen here.
Thanks for your comments. Jordan