Efficiently Exporting Large Data (millions of rows) in Laravel: Leveraging Queues and Jobs for PDF and CSV?Exports
I am currently developing a point-of-sale application, and one of the essential features I'm working on is the ability to export data from various pages within the system. These exports need to be available in either CSV or PDF format. In some cases, we're dealing with millions of records, which poses challenges when using Laravel's stream download, especially when we encounter certain conditions such as exporting all records or exceeding 5000 records.
Because of the enormous data volume, the download process breaks in the middle, preventing the complete data from being returned. I've also experimented with using a job for this, but it consumed excessive memory and caused the export to fail. Increasing the memory allocation for the Laravel job wasn't an acceptable solution for me. While it may work for some, I chose to take a different approach that demands less memory. Here's what I'm doing: I've created a job that continually re-dispatches itself to the queue until it successfully exports the data to the filesystem. And then users can download the file without any issues since it's already generated and users can send the file by email as well as it is already in the filesystem.
I successfully employed the Laravel League\Csv\Reader and laravel-mpdf packages to address the export and memory issues as well. where Laravel league\Csv package reader and writer function uses the standard csv method to read and write.
Now, let's take a look at some code. Here's a simplified version of the Job file:
<?php
namespace App\Jobs;
use App\Repositories\Contracts\OrderRepository;
use Carbon\Carbon;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Filesystem\FileNotFoundException;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use League\Csv\Exception;
use League\Csv\Reader;
use League\Csv\Writer;
use Mpdf\MpdfException;
use PDF;
use setasign\Fpdi\PdfParser\CrossReference\CrossReferenceException;
use setasign\Fpdi\PdfParser\PdfParserException;
use setasign\Fpdi\PdfParser\Type\PdfTypeException;
class CreateOrderExportFileJob implements ShouldQueue
{
use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;
public $export;
public $searchCriteria;
public $directoryName;
public $fileSystemAdapter;
public $csvFilePath;
public $header;
public $subject;
public $branchName;
/**
* Create a new job instance.
*
* @param GenericExport $export
* @param array $searchCriteria
*/
public function __construct(GenericExport $export, array &$searchCriteria = [])
{
$this->directoryName = 'export';
$this->export = $export;
$this->searchCriteria = &$searchCriteria;
$this->fileSystemAdapter = Storage::disk('public');
}
/**
* set the branch name for pdf header
*
* @return void
*/
public function setBranchName()
{
$this->branchName = 'All Branch';
if (isset($this->searchCriteria['branchId'])){
$branch = app(BranchRepository::class)->findOne($this->searchCriteria['branchId']);
$this->branchName = $branch ? $branch->name : 'N/A';
}
}
/**
* @return Mpdf
* @throws \Mpdf\MpdfException
*/
public function initializeMpdf(): Mpdf
{
return new Mpdf([
'mode' => 'utf-8',
'format' => 'A4',
'margin_top' => 42,
'margin_left' => 10,
'margin_right' => 10,
'margin_bottom' => 12,
'title' => $this->export->fileName . '.pdf'
]);
}
/**
* Execute the job.
*
* @return void
* @throws MpdfException
* @throws CrossReferenceException
* @throws PdfParserException
* @throws PdfTypeException|FileNotFoundException
* @throws Exception
*/
public function handle()
{
$items = $this->getOrders();
if($this->export->status !== 'processing') {
$this->export->update([
'status' => 'processing',
'fileName' => Carbon::now()->format('Y_m_d_H_i_s') . '_order_list',
'statusMessage' => "Job {$this->searchCriteria['page']} in export processing started"
]);
//csv file headers
$headers = [
'id',
'invoice',
'date',
'branchName',
'salePersonName',
'customerName',
'tax',
'shippingCost',
'discount',
'amount',
'due',
'paid',
'profitAmount',
'deliveryMethod',
'paymentMethods',
'paymentStatus',
'status'
];
// Create a new CsvWriter instance and specify the output file path and
$csvFilePath = $this->fileSystemAdapter->path($this->directoryName . '/' .$this->export->fileName . '.csv');
$csv = Writer::createFromPath($csvFilePath, 'w+');
$csv->insertOne($headers);
} elseif($this->export->status === 'processing') {
$this->export->update([
'statusMessage' => "Job {$this->searchCriteria['page']} in export processing started"
]);
}
$csvFilePath = $this->fileSystemAdapter->path($this->directoryName . '/' .$this->export->fileName . '.csv');
$csv = Writer::createFromPath($csvFilePath, 'a+');
$mappedData = $items->map(function ($item) use ($csv){
return [
'id' => $item->id,
'invoice' => $item->invoice,
'date' => $item->created_at,
'branchName' => $item->branch ? $item->branch->name : 'N/A',
'salePersonName' => $item->salePerson ? $item->salePerson->name : 'N/A',
'customerName' => $item->customer ? $item->customer->name : 'N/A',
'tax' => $item->tax,
'shippingCost' => $item->shippingCost,
'discount' => $item->discount,
'amount' => $item->amount,
'due' => $item->due,
'paid' => $item->paid,
'profitAmount' => $item->profitAmount,
'deliveryMethod' => $item->deliveryMethod,
'paymentMethods' => $item->paymentMethods(),
'paymentStatus' => $item->paymentStatus,
'status' => $item->status
];
});
$csv->insertAll($mappedData);
$this->searchCriteria['page'] = $items->hasMorePages() ? $items->currentPage() + 1 : null;
if (is_null($this->searchCriteria['page']) || (!is_null($this->searchCriteria['last_page']) && (int) $this->searchCriteria['page'] > (int) $this->searchCriteria['last_page'])) {
if($this->export->exportAs == 'pdf') {
self::setBranchName();
$csvFilePath = $this->fileSystemAdapter->path($this->directoryName . '/' .$this->export->fileName . '.csv');
$csv = Reader::createFromPath($csvFilePath);
$csv->setHeaderOffset(0); // Assuming the first row is the header
$dataChunks = array_chunk(iterator_to_array($csv->getRecords()), 500);
$pdf = self::initializeMpdf();
foreach ($dataChunks as $key => $chunkData) {
$pdf->SetTitle($this->export->fileName);
if($key != 0) {
$pdf->AddPage();
}
$pdf->WriteHTML(view($this->export->viewPath, [
'serial' => $key 500 == 0 ? 1 : $key 500,
'items' => $chunkData,
'branch' => $this->branchName
]));
}
self::putProcessedPdf($pdf);
}
$this->header = 'Attachment of Generated Order Lists';
$this->subject = 'Order List Generated';
self::sendEmailToUser();
return;
}
// Refresh to get the current state of export before using it for the next job
$this->export->refresh();
dispatch(new static($this->export, $this->searchCriteria));
}
/**
* @param $pdf
* @return void
*/
public function putProcessedPdf($pdf)
{
$fileContent = $pdf->Output( $this->export->fileName . '.pdf', 'S');
$this->fileSystemAdapter->put( $this->directoryName . '/' .$this->export->fileName . '.pdf', $fileContent, 'public');
//remove csv file after pdf file generation
$this->fileSystemAdapter->delete($this->directoryName . '/' .$this->export->fileName . '.csv');
}
/**
* @return void
*/
public function sendEmailToUser()
{
$this->export->createdByUser->notify(new SendGeneratedExportFile([
'file_name' => $this->export->fileName . '.' . $this->export->exportAs,
'directory_name' => $this->directoryName,
'export_as' => $this->export->exportAs,
'header' => $this->header,
'subject' => $this->subject,
'app_description' => 'Your Requested '. $this->subject .' Please Check Your E-mail',
'mail_description' => 'Your Requested '. $this->subject .' And Attached in This E-mail'
]));
// We are done processing
$this->export->update([
'status' => 'processed',
'statusMessage' => 'Export file processed successfully and sent to the user email',
]);
}
/**
* @return void
*/
public function failed() {
// This method is called if the job fails
// Remove the files generated by the jobs if the jobs failed
$filePath = $this->directoryName . '/' . $this->export->fileName;
if ($this->fileSystemAdapter->exists($filePath . '.csv')) {
$this->fileSystemAdapter->delete($filePath . '.csv');
}
if ($this->fileSystemAdapter->exists($filePath . '.pdf')) {
$this->fileSystemAdapter->delete($filePath . '.pdf');
}
}
/**
* @return mixed
*/
public function getOrders() {
return app(OrderRepository::class)->findBy(array_diff_key($this->searchCriteria, array_flip(['last_page', 'items'])));
}
}
To dispatch this job, I have created a route on the api.php file. and from the controller, I fired the job?
<?php?
use App\Http\Controllers\GenericExportController;?
use Illuminate\Support\Facades\Route;?
Route::group(['prefix' => 'export'], function () { Route::get('order-list/{as}', [GenericExportController::class, 'orderList']) ->where('as', 'pdf|csv'); });
the complete code implementation example is available on gist. you can follow the below link
领英推荐
I went with this approach because:
Hope this helps someone on the same track. Cheers
inspired from: