Russell Bird Posted July 26, 2019

Hi, I'm hoping to use an external tab-delimited TXT file to store a master price list for use with our CAD drawings. The file contains over 12,000 item codes, which means I can't load it into a worksheet in VectorWorks, as the row limit on those is 4094. I've proven I can open and read a line from the file, but then I wanted to test how quickly VectorWorks could scan through the rows, so I simply put in a while loop with an incremental counter variable. At the end of the file (tested via EOF), the loop should exit and report the number of lines read (loops completed, in effect) by displaying the counter variable in a dialog (using AlrtDialog). I set this running and waited (a long time), but the dialog never appeared and the interface was locked out, so I assume the script was still running. Two possibilities: 1) the loop has a fault causing it to loop indefinitely, or 2) VectorWorks is really slow at reading lines from a file. I don't think it's option 1, as I've checked the loop code and am pretty sure it's fine. Could anyone tell me whether there are limits to the number of lines or file size that can be read with VectorScript, and/or whether VectorScript is just really slow (in which case pretty useless) at reading files? Thanks, Russell
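For reference, the counting test described above is only a few lines in Python (the language the replies below recommend). A minimal sketch, assuming a plain text file; the path is a placeholder:

```python
# Sketch of the line-counting test described above, in Python rather than
# VectorScript. Iterating over a file object stops automatically at EOF,
# so no explicit end-of-file test is needed.

def count_lines(path):
    count = 0
    with open(path, mode="r", encoding="utf-8") as f:
        for _ in f:          # reads one line per iteration, stops at EOF
            count += 1
    return count
```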
JBenghiat Posted July 26, 2019

I strongly recommend switching to Python for file access. File reading and data parsing are native to the language, while the VS routines are proprietary and less flexible. (Not to mention, numerous Python examples and tutorials exist on the web.) https://www.guru99.com/reading-and-writing-files-in-python.html You should be able to read the file and load the data into arrays or objects with a couple of lines.
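As a sketch of the "couple of lines" idea: loading a tab-delimited price list into a dict keyed by item code. The file name and the two-column layout (code, then price) are assumptions for illustration; adjust the split/indexing to the real columns:

```python
# Hedged sketch: load a tab-delimited price list into a dict keyed by item
# code. Assumes at least two columns per row: item code, then price.

def load_price_list(path):
    prices = {}
    with open(path, mode="r", encoding="utf-8") as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:          # skip blank or malformed lines
                prices[fields[0]] = fields[1]
    return prices
```

Lookup is then a direct dict access, e.g. prices["A100"], with no row-by-row scan.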
Peter Vandewalle Posted July 26, 2019

I agree on the Python idea. And I would suggest using an XML file instead of a TXT file. XML data can be accessed by address instead of by looping.
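The access-by-address idea can be sketched with Python's standard xml.etree.ElementTree module. The element and attribute names ("item", "code", "price") are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Hedged sketch: look up one item directly in an XML price list instead of
# looping over every row. Element/attribute names are hypothetical.
XML_DATA = """<pricelist>
  <item code="A100"><price>12.50</price></item>
  <item code="B200"><price>3.99</price></item>
</pricelist>"""

root = ET.fromstring(XML_DATA)
# ElementTree supports a limited XPath syntax for addressing nodes:
node = root.find(".//item[@code='B200']/price")
print(node.text)  # prints 3.99
```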
LarryO Posted July 27, 2019

It sounds like you are in an infinite loop. The fix could be as simple as checking for NULL data from your read-line request to determine EOF. I don't believe files embed a readable ASCII 4 (EOT) character, because EOF is a control state by definition.
MullinRJ Posted July 29, 2019

Russell, I just finished a script in Python today that imports a 220 MB text file into a LIST (5,321,708 lines) - one text line per list element. I was able to do some serious parsing of the data (extracting names, searching for complex patterns, combining lines, and marking most for deletion), then write it out and back in to repeat the process a dozen times, all in under 35 seconds. This is not possible in VectorScript (VS), as the largest array in VS is limited to 32K elements. Your 12K item codes would fit into a VS array, but you'll have to code most of the routines you need yourself. Chances are, what you want to do is already available in Python as a set of canned routines. For many things, VS and Python are nearly equal in speed. But if you're going to do a lot of string manipulation, Python has way more text-handling features than Pascal. It's worth the effort to explore the language. Also, there is a seemingly unending wealth of help sites and fora on the web for Python. Let Google lead you to the answers you seek. My 2¢, Raymond
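The one-line-per-list-element import step described above is essentially this in Python (path is a placeholder; this does load the whole file into memory at once, which is fine for files of this size on a modern machine):

```python
# Minimal sketch: load a text file into a list, one line per element,
# with trailing newlines stripped by splitlines().

def read_into_list(path):
    with open(path, mode="r", encoding="utf-8") as f:
        return f.read().splitlines()
```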
Nicolas Goutte Posted July 29, 2019

Indeed, that is what I would recommend too: use Python. VS was not meant for handling large files. You could even use the csv module if your data is organized in columns (be it comma-separated, tab-separated, or any similar tabular format that the csv module supports).
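A minimal sketch of the csv-module approach for a tab-delimited file; the column layout is an assumption:

```python
import csv

# Hedged sketch: parse a tab-delimited file with the standard csv module.
# delimiter="\t" switches the reader from commas to tabs; newline="" is the
# recommended open() mode when using csv.

def read_rows(path):
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f, delimiter="\t"))
```

Each row comes back as a list of column strings, so quoted fields and embedded delimiters are handled for you.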
warrenfelsh Posted May 6, 2020

If you are working with big data, readlines() is not very efficient, as it can result in a MemoryError: that function loads the entire file into memory and then iterates over it. A slightly better approach for large files is to use the fileinput module, as follows:

import fileinput

for line in fileinput.input(['sample.txt']):
    print(line)

The fileinput.input() call reads lines sequentially and doesn't keep them in memory after they've been read. You can do this even more simply, since a file object in Python is itself iterable.
Nicolas Goutte Posted May 25, 2020

On 5/6/2020 at 7:31 AM, warrenfelsh said:

If you are working with big data, readlines() is not very efficient, as it can result in a MemoryError [...] A slightly better approach for large files is to use the fileinput module.

The Python 3.5 documentation does not recommend using the fileinput module for just one file: https://docs.python.org/3.5/library/fileinput.html Instead, https://docs.python.org/3.5/library/io.html (the paragraph on readlines) recommends using:

for line in file:

So the code sample above would then read something like:

for line in open('sample.txt', mode='r'):
    print(line)