v.9.9.7 SQL Task output to CSV Performance on large Datasets - VisualCron - Forum

Community forum

Paul Humphris
2022-06-08T10:18:54Z
Using v9.9.7, SQL Task

I have a simple SQL Query which extracts approx 250k rows of data from a SQL Server database, which is ran directing the output to StdOut, completes in approx 10sec.

However, when I change the output to Comma separated, No text qualifier. CrLf line breaks and no headers and specify a CSV File path, the task takes in the order of 30mins to complete.

I have tried creating a Write Task which reads the SQL Task StdOut and writes to a file and this in completed in approx 2 secs. Also a powershell script which runs the SQL Query and outputs to a CSV file can complete in 10-15secs.

Can you please investigate the poor performance of the SQL Task when outputting large datasets to CSV files?
thomas
2022-06-08T14:32:05Z
Is the 'append to file' checked? You could try toggling it and see if it makes a difference. Maybe they append one line at the time..
Paul Humphris
2022-06-09T08:16:25Z
The Append to File option is unchecked as I require a new file to be created.
Michael Fjellström
2022-06-10T11:30:47Z
Originally Posted by: Paul Humphris 

The Append to File option is unchecked as I require a new file to be created.



What database are you using and connection setting? ODBC/OLEDB/Native?

Can you provide some sample we could use to reproduce this on our environment? Job+file. Send an email to support@visualcron.com please
Michael Fjellström
2022-06-10T13:15:18Z
Also how much data in bytes is this? In mb
Paul Humphris
2022-06-15T13:52:44Z
The data consists of 280326 rows of 18 columns from a SQL Server database (Native connection), resulting in a CSV file 31.5Mb in size.

The SQL Task is configured with a simple "select * from...." statement from a SQL Table.

The Output to file options are set to Comma field separator and CrLf line break.

The output path is using {JOB(Active|Variable|RootPath)}\Folder\File.csv to specify the output file.

Joey S
2022-06-16T20:12:05Z
I created a SQL job to run. Locally, on the SQL server it takes 40 seconds to run and the resulting CSV file is 30.5MB (size on disk is 29.7 MB)

Running the same query via VC job on a remote server takes 36.1 seconds when outputting to Standard Output. VC lists the output size as 27.25 MB. If I export that data, the size on disk is 26.6 MB.

If I run the VC job output to a file (local server where VC is running, not appending) the job takes 43.6 seconds (20% longer). The resulting file is identical in size to the exported version originating from the standard out.

I am in version VC v.9.9.6
SQL is Developer Edition v12.3.6024.0

Seems to be working ok. The difference I see between running the query on the SQL box and using VC is the date formatting (2022-06-16 vs 6/16/2022 on the VC server), time formats and when VC runs the query I see blank values where there are NULL values on the SQL server run. This would explain the difference in file sizes.

Here is my output setting
image.png
Paul Humphris
2022-06-17T09:20:30Z
It seems that this problem does not occur when writing to a local file, but only to a network UNC share.
Joey S
2022-06-17T15:30:32Z
Agree, running it again to a network share that requires a credential. At 3 min mark and the file is still 0kb although I can open the file and see it is 44k rows long

The file should be 175k rows so I would estimate it is 25% done

Something is weird with the buffer or whatever temp file is made to do this part of it. The obvious work around would be write it locally then move it with an additional task. In general I would never write across a network so two tasks would be "right" to me either way.
Michael Fjellström
2022-06-22T15:16:12Z
Originally Posted by: Joey S 

Agree, running it again to a network share that requires a credential. At 3 min mark and the file is still 0kb although I can open the file and see it is 44k rows long

The file should be 175k rows so I would estimate it is 25% done

Something is weird with the buffer or whatever temp file is made to do this part of it. The obvious work around would be write it locally then move it with an additional task. In general I would never write across a network so two tasks would be "right" to me either way.



When you say locally, do you mean that the VC server is on the same machine as the SQL server, or are they on different machines?
Joey S
2022-06-22T15:23:29Z
Locally would be where visualcron is running. All of my tests were with a "remote" sql server and a remote file destination server

Server 1 - SQL (remote, no destination files, no visualcron client/server)
Server 2 - Visualcron server - all jobs run here (local)
Server 3 - File server (remote, no VC client/server, no SQL)
Scroll to Top