box/spout使用记录

本文档介绍了 Spout 3.0 版本的更新内容及使用方法,包括 CSV、XLSX 和 ODS 文件的读写操作,并提供了样式设置、自定义临时文件夹等高级功能的示例。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

spout 升级命名3.0后变动较多,且官网挂了,记录下文档
包地址
https://packagist.org/packages/box/spout

github地址
https://github.com/box/spout

composer地址
composer require box/spout

读写

//包空间有变动 ,只保留了部分示例性代码
use Box\Spout\Reader\Common\Creator\ReaderEntityFactory;
use Box\Spout\Writer\Common\Creator\WriterEntityFactory;

	/**
     * @param $filename 文件名  string
     * @param $admin    用户名  string
     * @param $headerValues 头标题 []
     * @param $cellValues   表数据 [[],[]]
     * @return mixed
     */
    public function createCSV($filename,$admin,$headerValues,$cellValues)
    {
        list($savePath,$salPath) = $this->createFile($filename,$admin);
        $writer = WriterEntityFactory::createCSVWriter();
        $writer->openToFile($savePath);
        WriterEntityFactory::createWriterFromFile('.csv');
        //这个单元格有自己的样式
//        $cell1  = WriterEntityFactory::createCell('foowahhh',$cellStyle1);
//        $cell2  = WriterEntityFactory::createCell(12345444,$cellStyle2);
//        $row2   = WriterEntityFactory::createRow([$cell1,$cell2]);
        $header = WriterEntityFactory::createRowFromArray($headerValues);
        $writer->addRows([$header]);
        
        foreach($cellValuesas $value){
             $row = WriterEntityFactory::createRowFromArray($value);
             $writer->addRows([$row]);
         }
            $writer->close();
      
        return $salPath;
    }
     public function createFile($fileName,$admin_nickname)
    {
      	//返回相对路径和绝对路径
        return [$savePath,$salPath];
    }
    
     public function createXLSX()
    {
        $writer = WriterEntityFactory::createXLSXWriter();
    }
    
    public function createODS()
    {
        $writer = WriterEntityFactory::createODSWriter();
    }
    
    public function readCSV()
    {
        $reader = ReaderEntityFactory::createCSVReader();
    }
    
    public function readXLSX()
    {
        $reader = ReaderEntityFactory::createXLSXReader();
    }
    
    public function readODS()
    {
        $reader = ReaderEntityFactory::createODSReader(); 
    }

示例-样式-styles

文档 https://github.com/box/spout/tree/master/docs
//包空间有变动
use Box\Spout\Writer\Common\Creator\Style\StyleBuilder
use Box\Spout\Writer\Common\Creator\Style\BorderBuilder
use Box\Spout\Common\Entity\Style\Border
use Box\Spout\Common\Entity\Style\BorderPart
use Box\Spout\Common\Entity\Style\Color
use Box\Spout\Common\Entity\Style\Style

//支持字体,背景,边框以及对齐样式。
$cellStyle1 = (new StyleBuilder())
               ->setFontBold()
               ->setFontSize(15)
               ->setFontColor(Color::BLUE)
               ->setShouldWrapText()
               ->setBackgroundColor(Color::YELLOW)
               ->build();
               
//向行添加边框需要一个Border对象
$border = (new BorderBuilder())
    ->setBorderBottom(Color::GREEN, Border::WIDTH_THIN, Border::STYLE_DASHED)
    ->build();

$style = (new StyleBuilder())
    ->setBorder($border)
    ->build();


//Spout将为所有创建的行使用默认样式。可以通过以下方式覆盖此样式:
$defaultStyle = (new StyleBuilder())
                ->setFontName('Arial')
                ->setFontSize(11)
                ->build();


部分文档内容

CategoryPropertyAPI
FontBoldStyleBuilder::setFontBold()
ItalicStyleBuilder::setFontItalic()
UnderlineStyleBuilder::setFontUnderline()
StrikethroughStyleBuilder::setFontStrikethrough()
Font nameStyleBuilder::setFontName(‘Arial’)
Font sizeStyleBuilder::setFontSize(14)
Font colorStyleBuilder::setFontColor(Color::BLUE)
StyleBuilder::setFontColor(Color::rgb(0, 128, 255))
AlignmentWrap text`StyleBuilder::setShouldWrapText(true
New sheet creation

It is also possible to change the behavior of the writer when the maximum number of rows (1,048,576) have been written in the current sheet:

use Box\Spout\Writer\WriterFactory;
use Box\Spout\Common\Type;

$writer = WriterFactory::create(Type::ODS);
$writer->setShouldCreateNewSheetsAutomatically(true); // default value
$writer->setShouldCreateNewSheetsAutomatically(false); // will stop writing new data when limit is reached
Using custom temporary folder

Processing XLSX and ODS files require temporary files to be created. By default, Spout will use the system default temporary folder (as returned by sys_get_temp_dir()). It is possible to override this by explicitly setting it on the reader or writer:

use Box\Spout\Writer\WriterFactory;
use Box\Spout\Common\Type;

$writer = WriterFactory::create(Type::XLSX);
$writer->setTempFolder($customTempFolderPath);
Strings storage (XLSX writer)

XLSX files support different ways to store the string values:

  • Shared strings are meant to optimize file size by separating strings from the sheet representation and ignoring strings duplicates (if a string is used three times, only one string will be stored)
  • Inline strings are less optimized (as duplicate strings are all stored) but is faster to process

In order to keep the memory usage really low, Spout does not optimize strings when using shared strings. It is nevertheless possible to use this mode.

use Box\Spout\Writer\WriterFactory;
use Box\Spout\Common\Type;

$writer = WriterFactory::create(Type::XLSX);
$writer->setShouldUseInlineStrings(true); // default (and recommended) value
$writer->setShouldUseInlineStrings(false); // will use shared strings

Note on Apple Numbers and iOS support
Apple’s products (Numbers and the iOS previewer) don’t support inline strings and display empty cells instead. Therefore, if these platforms need to be supported, make sure to use shared strings!

Date/Time formatting

When reading a spreadsheet containing dates or times, Spout returns the values by default as DateTime objects.
It is possible to change this behavior and have a formatted date returned instead (e.g. “2016-11-29 1:22 AM”). The format of the date corresponds to what is specified in the spreadsheet.

use Box\Spout\Reader\ReaderFactory;
use Box\Spout\Common\Type;

$reader = ReaderFactory::create(Type::XLSX);
$reader->setShouldFormatDates(false); // default value
$reader->setShouldFormatDates(true); // will return formatted dates
Playing with sheets

When creating a XLSX or ODS file, it is possible to control which sheet the data will be written into. At any time, you can retrieve or set the current sheet:

$firstSheet = $writer->getCurrentSheet();
$writer->addRow($rowForSheet1); // writes the row to the first sheet

$newSheet = $writer->addNewSheetAndMakeItCurrent();
$writer->addRow($rowForSheet2); // writes the row to the new sheet

$writer->setCurrentSheet($firstSheet);
$writer->addRow($anotherRowForSheet1); // append the row to the first sheet

It is also possible to retrieve all the sheets currently created:

$sheets = $writer->getSheets();

If you rely on the sheet’s name in your application, you can access it and customize it this way:

// Accessing the sheet name when reading
foreach ($reader->getSheetIterator() as $sheet) {
    $sheetName = $sheet->getName();
}

// Accessing the sheet name when writing
$sheet = $writer->getCurrentSheet();
$sheetName = $sheet->getName();

// Customizing the sheet name when writing
$sheet = $writer->getCurrentSheet();
$sheet->setName('My custom name');

Please note that Excel has some restrictions on the sheet’s name:

  • it must not be blank
  • it must not exceed 31 characters
  • it must not contain these characters: \ / ? * : [ or ]
  • it must not start or end with a single quote
  • it must be unique

Handling these restrictions is the developer’s responsibility. Spout does not try to automatically change the sheet’s name, as one may rely on this name to be exactly what was passed in.

Fluent interface

Because fluent interfaces are great, you can use them with Spout:

use Box\Spout\Writer\WriterFactory;
use Box\Spout\Common\Type;

$writer = WriterFactory::create(Type::XLSX);
$writer->setTempFolder($customTempFolderPath)
       ->setShouldUseInlineStrings(true)
       ->openToFile($filePath)
       ->addRow($headerRow)
       ->addRows($dataRows)
       ->close();
Running tests

On the master branch, only unit and functional tests are included. The performance tests require very large files and have been excluded.

If you just want to check that everything is working as expected, executing the tests of the master branch is enough.

If you want to run performance tests, you will need to checkout the perf-tests branch. Multiple test suites can then be run, depending on the expected output:

  • phpunit - runs the whole test suite (unit + functional + performance tests)
  • phpunit --exclude-group perf-tests - only runs the unit and functional tests
  • phpunit --group perf-tests - only runs the performance tests

For information, the performance tests take about 30 minutes to run (processing 1 million rows files is not a quick thing).

public function importExcel(): Json { // 设置脚本执行时间无限制,避免大数据处理超时 set_time_limit(0); // 设置脚本最大内存使用量为2GB,处理大文件时避免内存溢出 ini_set('memory_limit', '2048M'); // 获取请求参数(包含上传的文件信息) $params = $this->request->param(); // 验证文件是否上传 if (empty($params["file"])) { return $this->fail('请上传要导入文件'); } // 定义Excel表头与数据库字段的映射关系 $titleArr = [ '商家编号' => 'merchant_code', '业务单号' => 'business_order_number', '平台订单号' => 'platform_order_number', '商家订单号' => 'merchant_order_number', '退回件关联运单号' => 'return_waybill_number', '店铺名称' => 'shop_name', '下单时间' => 'order_time', '始发省' => 'origin_province', '始发市' => 'origin_city', '目的省' => 'destination_province', '目的市' => 'destination_city', '计费重量' => 'billing_weight', '实际重量(kg)' => 'actual_weight_kg', '实际体积(cm3)' => 'actual_volume_cm3', '首重单价' => 'first_weight_price', '续重单价' => 'continued_weight_price', '包裹/耗材数量' => 'package_material_quantity', '销售平台订单号' => 'sales_platform_order_number', '费用类型' => 'fee_type', '原始金额' => 'original_amount', '折扣/促销金额' => 'discount_promotion_amount', '计费时间' => 'billing_time', '结算金额' => 'settlement_amount', '签收时间' => 'sign_time', '商品编码' => 'product_code', '商品名称' => 'product_name', '实际出库数量' => 'actual_shipment_quantity', '续重单价2' => 'continued_weight_price2' ]; try { // 构建上传文件的完整路径 $path = app()->getRootPath() . "public/" . $params['file']; // 1. 创建Excel读取器并设置为只读取数据模式(优化内存) $reader = IOFactory::createReaderForFile($path); $reader->setReadDataOnly(true); // 2. 获取工作表基本信息(总行数和最大列字母),避免加载整个文件 $worksheetInfo = $reader->listWorksheetInfo($path); $totalRows = $worksheetInfo[0]['totalRows']; $highestColumn = $worksheetInfo[0]['lastColumnLetter']; // 3. 读取表头行(仅加载第一行数据) $chunkFilter = new ChunkReadFilter(); $chunkFilter->setRows(1, 1); $reader->setReadFilter($chunkFilter); $spreadsheet = $reader->load($path); $sheet = $spreadsheet->getSheet(0); $title = $sheet->rangeToArray('A1:' . $highestColumn . '1', null, true, true, true)[1]; // 释放不再使用的Spreadsheet对象,减少内存占用 $spreadsheet->disconnectWorksheets(); unset($spreadsheet); // 检查表头是否存在 if (empty($title)) { throw new \Exception('Excel表头为空'); } // 初始化数据库模型和计数器 $model = new DeliveryPrimaryRegionsItemizationModel(); //TODO:根据自身mysql所在服务的性能来控制批量插入大小 不然报错或者导入很慢 该值需要反复测验找到最合适的大小 $insertBatchSize = 100; // 数据库插入批次大小 $total = 0; // 总记录数 $success = 0; // 成功处理记录数 $failedBatches = []; // 失败批次记录 $batchData = []; // 批次数据临时存储 $processedRows = 0; // 已处理行数 // 4. 分块处理Excel数据,每次处理$chunkSize行 //TODO:根据自身服务器性能来决定每次分块读取excel表格数据的条数 不然报错或者导入很慢 该值需要反复测验找到最合适的大小 $chunkSize = 500; // 每次加载5000行 $chunkFilter = new ChunkReadFilter(); $reader->setReadFilter($chunkFilter); // 循环处理每个数据块 for ($startRow = 2; $startRow <= $totalRows; $startRow += $chunkSize) { $endRow = min($startRow + $chunkSize - 1, $totalRows); $chunkFilter->setRows($startRow, $chunkSize); // 5. 加载当前数据块 $spreadsheet = $reader->load($path); $sheet = $spreadsheet->getSheet(0); // 6. 处理当前块中的每一行数据 for ($row = $startRow; $row <= $endRow; $row++) { // 获取当前行数据 $rowData = $sheet->rangeToArray('A' . $row . ':' . $highestColumn . $row, null, true, true, true)[$row]; $item = []; // 将Excel列名映射为数据库字段名 foreach ($rowData as $colIndex => $value) { $colName = $title[$colIndex] ?? ''; if ($colName && isset($titleArr[$colName])) { $item[$titleArr[$colName]] = trim($value ?? ''); // 原始值去除首尾空格 } } // print_r($item);die; // 如果行数据不为空,则添加到批次数据中 if (!empty($item)) { $batchData[] = $item; $processedRows++; } // 7. 当批次数据达到插入批次大小时,执行数据库插入操作 if (count($batchData) >= $insertBatchSize) { $this->processBatch($model, $batchData, $success, $failedBatches, $total); $batchData = []; // 清空批次数据 } $total++; // 总记录数递增 } // 8. 释放当前数据块占用的内存 $spreadsheet->disconnectWorksheets(); unset($spreadsheet); gc_collect_cycles(); // 触发垃圾回收 // 9. 输出处理进度信息 $progress = round(($endRow / $totalRows) * 100, 2); echo "处理进度: {$progress}% | 已处理: {$processedRows} 行\n"; } // 10. 处理剩余不足一个批次的数据 if (!empty($batchData)) { $this->processBatch($model, $batchData, $success, $failedBatches, $total); } // 返回处理结果 if (empty($failedBatches)) { $resultMsg = "导入成功,共{$total}条数据,成功导入{$success}条"; return $this->success($resultMsg); } else { $errorCount = $total - $success; $errorMsg = "失败 {$errorCount} 条,第一个错误: {$failedBatches[0]['error']}"; return $this->fail('导入失败', ['remark' => $errorMsg]); } } catch (\Exception $e) { // 异常处理,返回错误信息 return $this->fail('导入过程中出错: ' . $e->getMessage()); } } /** * 处理数据批次(优化SQL构建) */ private function processBatch($model, &$batchData, &$success, &$failedBatches, &$total) { // 如果批次数据为空则直接返回 if (empty($batchData)) return; try { // 使用参数绑定构建SQL语句,防止SQL注入 $tableName = $model->getTable(); $fields = array_keys($batchData[0]); $fieldList = '`' . implode('`, `', $fields) . '`'; $placeholders = '(' . implode(',', array_fill(0, count($fields), '?')) . ')'; $sql = "INSERT INTO {$tableName} ({$fieldList}) VALUES "; $values = []; $params = []; // 构建批量插入的SQL语句和参数数组 foreach ($batchData as $row) { $values[] = $placeholders; foreach ($fields as $field) { $params[] = $row[$field] ?? null; } } $sql .= implode(',', $values); // 执行带参数绑定的SQL语句 // 一定不要使用模型去操作数据库 不然效率很低 原生的最好! Db::execute($sql, $params); // 更新成功记录数 $success += count($batchData); // 清空批次数据 $batchData = []; } catch (\Exception $e) { // 记录失败批次信息 $failedBatches[] = [ 'startRow' => $total - count($batchData) + 1, 'endRow' => $total, 'error' => $e->getMessage() ]; // 出错时清空批次数据,避免重复处理 $batchData = []; } } /** * 分块读取过滤器 * 用于控制只读取Excel文件中的指定行范围 * common.php代码不规范导致本地git有差异直接还原 直接写在控制器里算了! */ class ChunkReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter { private $startRow = 0; private $endRow = 0; // 设置要读取的起始行和行数 public function setRows($startRow, $chunkSize) { $this->startRow = $startRow; $this->endRow = $startRow + $chunkSize; } // 判断当前单元格是否应该被读取 public function readCell($column, $row, $worksheetName = '') { // 只读取指定范围内的行 if ($row >= $this->startRow && $row <= $this->endRow) { return true; } return false; } } 这是我读取excel表格的php的写法,调用importExcel方法开始导入excel表格数据,但是总是报错502 请帮我分析一下这是为什么呢?另外我的nginx的配置增加了如下配置: # 代理相关超时配置(适用于反向代理场景) 3600s是1个小时 如果遇到大的excel表格的导入可设置7200或者更大 proxy_connect_timeout 3600; # 与上游服务器的连接超时 proxy_read_timeout 3600; # 从上游服务器读取响应的超时 proxy_send_timeout 3600; # 发送请求到上游服务器的超时 # 或者使用fastcgi相关超时配置(如果直接连接PHP-FPM) 3600s是1个小时 如果遇到大的excel表格的导入可设置7200或者更大 fastcgi_connect_timeout 3600; fastcgi_read_timeout 3600; fastcgi_send_timeout 3600; client_max_body_size 100M; client_body_buffer_size 1024k;
07-09
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值