It really depends on the machine that is running the code. Pandas will always have the entire thing loaded in memory, and while 600 MB is not a concern for our modern laptops running a single analysis at a time, it can get really messy if the person is not thinking about hardware limitations.
Pandas supports lazy loading and can read files in chunks. Hell, even regular ole Python doesn’t need to read the whole file at once with `csv`.
I didn’t know about lazy loading, that’s cool!
Then I guess that the meme doesn’t apply anymore. Though I will state that (from my anecdotal experience) people who can use pandas’ most advanced features* are also comfortable with other data processing frameworks (usually more suitable to large datasets**).
*Anything beyond the standard `groupby-apply` can be considered advanced, from the places I’ve been.
**I feel the urge to note that 60 MB isn’t a large dataset by any means, but I believe that’s beside the point.
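For anyone curious what the chunked reading mentioned above looks like in practice, here is a minimal sketch. The sample file, the `id`/`value` columns, the chunk size, and the two helper functions are all made up for illustration; the only library features it relies on are the `chunksize` parameter of `pandas.read_csv` and the standard library's `csv.DictReader`.

```python
import csv
import os
import tempfile

import pandas as pd

# Hypothetical sample file standing in for a CSV too large to load at once.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "value"])
    writer.writerows((i, i * 2) for i in range(10_000))


def total_with_pandas(path, chunksize=1_000):
    """Sum the 'value' column while holding only one chunk of rows in memory."""
    total = 0
    # With chunksize set, read_csv yields DataFrames of up to `chunksize` rows
    # instead of returning one DataFrame for the whole file.
    for chunk in pd.read_csv(path, chunksize=chunksize):
        total += int(chunk["value"].sum())
    return total


def total_with_csv(path):
    """Same sum with the standard library, streaming one row at a time."""
    total = 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            total += int(row["value"])
    return total


pandas_total = total_with_pandas(path)
csv_total = total_with_csv(path)
print(pandas_total, csv_total)
```

This only helps when the computation can be combined chunk by chunk (sums, counts, filters). Something like a full sort or a groupby over the whole dataset still needs the partial results merged at the end, which is where the other frameworks tend to come in.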