Question: Flow vs. Iterate and dealing with Files


runger

Hi!

I'm trying to create a "data migration" type job to move the data from one (old, ugly) CMS system to another (new, shiny) CMS system.

The job is complicated: the CMS (Content Management System) data is heterogeneous (XML, relational DB tables, binary files such as images, and text files). Parsing the data involves reading files and DB data based on the content of other files and DB data.

E.g., a typical task would be: read the config file for website X to discover the filesystem locations and database IDs of the data related to website X. Then read data from the database tables based on these IDs and, based on this DB data, move other files to new locations, etc.

Sounds like a good task for JasperETL, right?

--> As it turns out, I am having some trouble realizing these tasks using the built-in components.

In particular, switching between flow (required for DB access) and iterate (for accessing files) within the jobs is causing trouble: file components can't take flow links, and the iterate links don't transport the data associated with the previous flow, etc.

Many of my tasks seem to involve reading rows from a file, and then augmenting those rows with data from other files - is there a better way of doing this than the solution I have found (illustrated below)? (this is a simple example - imagine extending it to deal with rejected flows, more input files, additional DB input, etc...)

/uploads/projects/jasperetl/image/example.jpg

I guess what I am looking for is a way to merge data from a file into an existing flow, where the filename to read from is given by the existing flow's rows...
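To make concrete what I mean, here is a rough sketch in plain Java of the behaviour I am after. This is not JasperETL code: the class `FlowFileJoin`, the method `augment`, and the column names are made up for illustration, and rows are modelled simply as maps from column name to value.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlowFileJoin {

    // Hypothetical helper: for every incoming row, open the file named in
    // row.get(fileField), match each line against linePattern, and emit one
    // output row per matching line. Each output row carries all incoming
    // columns plus the captured groups, stored under groupNames in order.
    // groupNames must not list more names than the pattern has groups.
    static List<Map<String, String>> augment(List<Map<String, String>> rows,
                                             String fileField,
                                             Pattern linePattern,
                                             String... groupNames) throws IOException {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : rows) {
            Path file = Paths.get(row.get(fileField));
            for (String line : Files.readAllLines(file)) {
                Matcher m = linePattern.matcher(line);
                if (!m.matches()) {
                    continue; // a real job would send these to a reject flow
                }
                Map<String, String> merged = new LinkedHashMap<>(row);
                for (int i = 0; i < groupNames.length; i++) {
                    merged.put(groupNames[i], m.group(i + 1));
                }
                out.add(merged);
            }
        }
        return out;
    }
}
```

So a row like `{site=X, config=/some/path.cfg}` would come out as one row per matching line of that file, each still carrying `site` and `config`, with no detour through an iterate link.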

Any ideas?


More generally, the question that forms in my head is: why the difference between flow and iterate? Why can a tFileList component not produce a flow, say with a fixed schema including filename and filepath? Why can a tFileInputRegex not take a flow as input, with the associated logic being that the tFileInputRegex is invoked once per incoming flow element?
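As an illustration of the first of those questions, this is roughly what I would expect a flow-producing file list to do, again sketched in plain Java rather than as a JasperETL component. The class `FileListFlow`, the method `listAsRows`, and the two-column schema (`filename`, `filepath`) are my own assumptions.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FileListFlow {

    // Hypothetical helper: list the regular files in dir that match the glob
    // and return them as rows with a fixed schema: filename, filepath.
    static List<Map<String, String>> listAsRows(Path dir, String glob) throws IOException {
        List<Map<String, String>> rows = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                if (!Files.isRegularFile(p)) {
                    continue; // skip subdirectories and other non-file entries
                }
                Map<String, String> row = new LinkedHashMap<>();
                row.put("filename", p.getFileName().toString());
                row.put("filepath", p.toAbsolutePath().toString());
                rows.add(row);
            }
        }
        // Directory order is unspecified, so sort by filename for a stable flow.
        rows.sort((a, b) -> a.get("filename").compareTo(b.get("filename")));
        return rows;
    }
}
```

Rows in that shape could then be joined, filtered, or mapped like any other flow instead of only being iterated over.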


Thanks a lot for your attention,

Regards from Vienna,

Richard Unger
