Searching for Data in an external TXT file


Hi,

 

I'm hoping to use an external tab-delimited TXT file to store a master price list for use with our CAD drawings. The file contains over 12,000 item codes, which means I can't load it into a worksheet in VectorWorks, as the row limit on those is 4094. I've proven I can open and read a line from the file, but then I wanted to test how quickly VectorWorks could scan through the rows, so I simply put in a while loop with an incrementing counter variable. At the end of the file (the while loop is based on EOF), the loop should exit and give me the number of lines read (in effect, the number of loops done) by displaying the counter variable in a dialog (using AlrtDialog).

 

I set this running and waited (a long time), but the dialog didn't get displayed and the interface was locked out, so I assume the script was still running. There are 2 possibilities: 1) the loop has a fault causing it to loop indefinitely, or 2) VectorWorks is really slow at reading lines from a file. I don't think it's option 1, as I've checked the loop code and am pretty sure it's fine.

 

I wondered if anyone could tell me whether there are limits to the number of lines or the file size that can be read with VectorScript, and/or whether VectorScript is just really slow at reading files (in which case it's pretty useless for this)?

 

Thanks

Russell


I strongly recommend switching to Python for file access. File reading and data parsing are native to the language, while the VS routines are proprietary and less flexible. (Not to mention, numerous Python examples and tutorials exist on the web.)

 

https://www.guru99.com/reading-and-writing-files-in-python.html

 

You should be able to read the file and load data into arrays or objects with a couple of lines.
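
As a minimal sketch of what that could look like, assuming the price list is a tab-delimited file with the item code in the first column and the price in the second. The file name `prices.txt`, the column layout, the UTF-8 encoding, and the helper name `load_prices` are all assumptions for illustration, not details from the original file:

```python
import csv

def load_prices(path):
    """Read a tab-delimited file into a dict keyed by item code.

    Assumes column 0 is the item code and column 1 is the price;
    adjust the indexes to match the real file layout.
    """
    prices = {}
    # encoding='utf-8' is an assumption; change it if the file differs.
    with open(path, newline='', encoding='utf-8') as f:
        for row in csv.reader(f, delimiter='\t'):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            prices[row[0]] = row[1]
    return prices
```

Once loaded, a lookup is a single dictionary access, e.g. `load_prices('prices.txt').get('ABC123')`, which returns `None` if the (hypothetical) code `ABC123` is not in the list.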

Edited by JBenghiat

Russell,

   I just finished a script in Python today that imports a 220 MB text file into a list (5,321,708 lines), one text line per list element. I was able to do some serious parsing of the data (extracting names, searching for complex patterns, combining lines, and marking most for deletion), then write it out and read it back in to repeat the process a dozen times, all in under 35 seconds. This is not possible in VectorScript (VS), as the largest array in VS is limited to 32K elements. Your 12K item codes would fit into a VS array, but you'll have to code most of the routines you'll need yourself. Chances are, what you want to do is already available in Python as a set of canned routines.
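
A rough sketch of the "one text line per list element" approach, followed by a pattern search over that list. This is not Raymond's script; the helper name `find_matching_lines`, the UTF-8 encoding, and whatever pattern you pass in are assumptions for illustration:

```python
import re

def find_matching_lines(path, pattern):
    """Load every line into a list, then return the lines matching a regex."""
    # encoding='utf-8' is an assumption about the file.
    with open(path, encoding='utf-8') as f:
        lines = f.read().splitlines()  # one list element per text line
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]
```

Holding the whole file in a list is fine for a 12,000-line price list; for much larger files, reading line by line avoids keeping everything in memory.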

 

   For many things, VS and Python are nearly equal in speed. But if you're going to do a lot of string manipulation, Python has way more text handling features than Pascal. It's worth the effort to explore the language. Also, there is a seemingly unending wealth of help sites and fora on the web for Python. Let Google lead you to the answers you seek.

 

My 2¢,

Raymond

 

 

  • 9 months later...

If you are working with very large files, using readlines() is not very efficient and can result in a MemoryError, because it loads the entire file into memory before you iterate over it.

 

A slightly better approach for large files is to use the fileinput module, as follows:


 

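import fileinput

# Iterate lazily; lines are read one at a time rather than all at once.
for line in fileinput.input(['sample.txt']):
    print(line, end='')  # each line already ends with a newline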

 

The fileinput.input() call reads lines sequentially, but doesn't keep them in memory after they've been read. You could even simply iterate over the file object directly, since a file in Python is iterable.

  • 3 weeks later...
On 5/6/2020 at 7:31 AM, warrenfelsh said:

If you are working with very large files, using readlines() is not very efficient and can result in a MemoryError, because it loads the entire file into memory before you iterate over it.

A slightly better approach for large files is to use the fileinput module, as follows:

import fileinput
for line in fileinput.input(['sample.txt']):
    print(line)

The fileinput.input() call reads lines sequentially, but doesn't keep them in memory after they've been read. You could even simply iterate over the file object directly, since a file in Python is iterable.

 

The Python 3.5 documentation for the fileinput module points to open() instead if you just want to read a single file: https://docs.python.org/3.5/library/fileinput.html

 

Instead, https://docs.python.org/3.5/library/io.html (paragraph for readlines) notes that you can iterate over the file object directly, without calling readlines():

for line in file:

 

So the code sample above would then read something like:

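with open('sample.txt', mode='r') as file:  # the file is closed when the block ends
    for line in file:
        print(line, end='')  # each line already ends with a newline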
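
Applied to the original question of searching a tab-delimited price list, the same line-by-line pattern could be sketched as below. The helper name `find_item`, the UTF-8 encoding, and the assumption that the item code sits in the first column are illustrative, not taken from the original file:

```python
def find_item(path, item_code):
    """Return the fields of the first line whose first column equals item_code.

    Reads the file one line at a time and stops at the first match,
    so the whole file is never held in memory. Returns None if not found.
    """
    # encoding='utf-8' is an assumption about the file.
    with open(path, mode='r', encoding='utf-8') as file:
        for line in file:
            fields = line.rstrip('\n').split('\t')
            if fields[0] == item_code:
                return fields
    return None
```

For repeated lookups against the same file, loading it once into a dictionary is likely the better trade-off, since this version rescans the file on every call.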


 

Edited by Nicolas Goutte
