
Read part of a massive online CSV file from the command line using its URL
You may need to read more than just the beginning of the file:
Use an HTTP Range request
To read parts other than the first, you can send an HTTP request for a byte range, for example with the -r (--range) option of curl, to get the part of the file from one byte position to another.
Because the file is CSV, you would guess which byte range could be useful, fetch that range, and then edit it manually to remove the partial lines of data at its start and end.
For example, to take a look at the 500 bytes/characters starting from byte 50000:
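A minimal sketch of such a request, assuming curl is installed. The URL is a placeholder, and byte_range and fetch_bytes are hypothetical helper names introduced only for illustration. HTTP byte ranges are zero-based and include the end position, so 500 bytes starting at byte 50000 is the range 50000-50499.

```shell
# Placeholder URL (an assumption); replace it with the real CSV location.
url="https://example.com/big.csv"

# byte_range START COUNT: print the range "first-last" covering COUNT bytes
# from zero-based offset START. The last position is inclusive.
byte_range() {
    printf '%s-%s\n' "$1" "$(( $1 + $2 - 1 ))"
}

# fetch_bytes URL START COUNT: ask the server for just that byte range.
# curl's -r/--range option sends the HTTP "Range" request header.
fetch_bytes() {
    curl -s -r "$(byte_range "$2" "$3")" "$1"
}

# Usage (needs network access):
#   fetch_bytes "$url" 50000 500
```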
Note how the first and last lines are cut off: a byte range returns an arbitrary piece of text from the file, not a set of whole CSV records.
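The manual clean-up can be approximated in the pipeline by dropping the first and last line of the slice. This is only a sketch, and drop_partial_lines is a hypothetical helper name. It assumes both edge lines really are incomplete, so it throws away a full record if the range happens to start or end exactly on a line boundary, and it does not handle quoted CSV fields that contain line breaks.

```shell
# drop_partial_lines: read a byte slice on stdin and print it without its
# first and last line, which are assumed to be incomplete records.
drop_partial_lines() {
    sed '1d;$d'
}

# Example with inline sample data; only the two complete lines remain.
printf 'rtial,1\nfull,2\nfull,3\npart' | drop_partial_lines
```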
When Range is not supported
Common web servers generally support "Range",
but it may still fail to work for some reason
(e.g. a custom server does not implement it, or a proxy in between interferes).
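One way to get a hint before relying on ranges is to look at the response headers. The sketch below checks for "Accept-Ranges: bytes". supports_ranges is a hypothetical helper name and the URL is a placeholder. A server may honor ranges without sending this header, so the more reliable sign is the status of an actual range request: 206 (Partial Content) means the range was served, while 200 means the whole file was sent.

```shell
# supports_ranges: read HTTP response headers on stdin and succeed if the
# server advertises byte-range support via "Accept-Ranges: bytes".
supports_ranges() {
    grep -qi '^accept-ranges:[[:space:]]*bytes'
}

# Usage (needs network access; the URL is a placeholder):
#   if curl -sI "https://example.com/big.csv" | supports_ranges; then
#       echo "Range looks supported"
#   fi
```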
In this case, we cannot avoid downloading the data that precedes the part we're interested in. But we can then cut out the part we need, by bytes or by lines.
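A sketch of the byte-based cut, assuming head and tail accept -c as the GNU and BSD versions do. byte_slice is a hypothetical helper name and the URL is a placeholder. Because head stops reading once it has enough bytes, curl will typically end the download at that point instead of fetching the rest of the file.

```shell
# byte_slice START COUNT: print COUNT bytes of stdin, starting at zero-based
# offset START. head keeps everything up to the end of the slice, then
# tail keeps only the last COUNT bytes of that.
byte_slice() {
    head -c "$(( $1 + $2 ))" | tail -c "$2"
}

# Usage (needs network access; the URL is a placeholder):
#   curl -s "https://example.com/big.csv" | byte_slice 50000 500
```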
The result is just the same as with the range request.
Read by line
To get lines instead of characters, use the -n option of head and tail instead of -c.
For lines 101 to 110 ("the last 10 lines of the first 110 lines"):
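A sketch of the line-based cut. line_slice is a hypothetical helper name and the URL is a placeholder. The last 10 lines of the first 110 are lines 101 to 110, counting from 1. If the input has fewer lines than the requested last line, tail returns whatever lines come last, so the result is only correct when the file is long enough.

```shell
# line_slice FIRST LAST: print lines FIRST through LAST of stdin
# (one-based, inclusive). head keeps the first LAST lines, then tail
# keeps the final LAST-FIRST+1 of those.
line_slice() {
    head -n "$2" | tail -n "$(( $2 - $1 + 1 ))"
}

# Usage (needs network access; the URL is a placeholder):
#   curl -s "https://example.com/big.csv" | line_slice 101 110
```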
Now, as CSV records are lines, we have a clean start and end of the section.
HTTP Range requests address bytes, not lines: to serve a line range, the HTTP server would need to read the whole file, including the part before the range, to count line numbers.