PHP如何切割excel大文件

利用phpspreadsheet可以轻松的解析excel文件,但是phpspreadsheet的内存消耗也是比较大的,我试过解析将近5M的纯文字excel内存使用量就会超过php默认的最大内存128M。

当然这可以用调节内存大小的方法来解决,但是在并发量大的时候就比较危险了。所以今天介绍下一种方法,利用phpspreadsheet对excel文件进行切割,这是个拿时间换空间的方法所以一般对时效性要求低的需求可以使用。

方法:

先放个phpspreadsheet官网提供的一个功能readCell,我们就可以利用这个功能来进行切割。

首先对excel文件进行预读,主要是获取所有的工作表以及工作表下面的数据行数,这个阶段readCell方法一直返回的都是false,我们只需要记录readCell进来的工作表及数据行数。

然后就是对获取到的记录进行分析,确定每部分数据需要装多少行原始excel的数据,需要注意的是为了避免内容混淆,不要将两个工作表的内容切到一起。

最后就是循环分析的数据和再次利用readCell获取每部分数据,注意每次读取文件后都要利用disconnectWorksheets方法清理phpspreadsheet的内存。

经过测试发现,利用该方法解析5M的excel文件,平均只需要21M的内存就可以搞定!

代码

<?php
namespace CutExcel;
require_once 'PhpSpreadsheet/autoload.php';
/**
* 预读过滤类
* @author wangyelou
* @date 2018-07-30
*/
class MyAheadreadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
{
public $record = array();
private $lastRow = '';
public function readCell($column, $row, $worksheetName = '')
{
if (isset($this->record[$worksheetName]) ) {
if ($this->lastRow != $row) {
$this->record[$worksheetName] ++;
$this->lastRow = $row;
}
} else {
$this->record[$worksheetName] = 1;
$this->lastRow = $row;
}
return false;
}
}
/**
* 解析过滤类
* @author wangyelou
* @date 2018-07-30
*/
class MyreadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
{
public $startRow;
public $endRow;
public $worksheetName;
public function readCell($column, $row, $worksheetName = '')
{
if ($worksheetName == $this->worksheetName && $row >= ($this->startRow+1) && $row <= ($this->endRow+1)) {
return true;
}
return false;
}
}
/**
* 切割类
* @author wangyelou
* @date 2018-07-30
*/
class excelCut
{
public $cutNum = 5;
public $returnType = 'Csv';
public $fileDir = '/tmp/';
public $log;
/**
* 切割字符串
* @param $str
* @return array|bool
*/
public function cutFromStr($str)
{
try {
$filePath = '/tmp/' . time() . mt_rand(1000, 9000) . $this->returnType;
file_put_contents($filePath, $str);
if (file_exists($filePath)) {
$result = $this->cutFromFile($filePath);
unlink($filePath);
return $result;
} else {
throw new Exception('文件写入错误');
}
} catch (Exception $e) {
$this->log = $e->getMessage();
return false;
}
}
/**
* 切割文件
* @param $file
* @return array|bool
*/
public function cutFromFile($file)
{
try {
$cutRules = $this->readaheadFromFile($file);
$dir = $this->getFileDir($file);
$returnType = $this->returnType ? $this->returnType : 'Csv';
$results = array();
//初始化读
$myFilter = new MyreadFilter();
$inputFileType = \PhpOffice\PhpSpreadsheet\IOFactory::identify($file);
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
$reader->setReadDataOnly(true);
$reader->setReadFilter($myFilter);
foreach ($cutRules as $sheetName => $rowIndexRange) {
//读
list($myFilter->startRow, $myFilter->endRow, $myFilter->worksheetName) = $rowIndexRange;
$spreadsheetReader = $reader->load($file);
$sheetData = $spreadsheetReader->setActiveSheetIndexByName($myFilter->worksheetName)->toArray(null, false, false, false);
$realDatas = array_splice($sheetData, $myFilter->startRow, ($myFilter->endRow - $myFilter->startRow + 1));
$spreadsheetReader->disconnectWorksheets();
unset($sheetData);
unset($spreadsheetReader);
//写
$saveFile = $dir . $sheetName . '.' . $returnType;
$spreadsheetWriter = new \PhpOffice\PhpSpreadsheet\Spreadsheet();
foreach ($realDatas as $rowIndex => $row) {
foreach ($row as $colIndex => $col) {
$spreadsheetWriter->getActiveSheet()->setCellValueByColumnAndRow($colIndex+1, $rowIndex+1, $col);
}
}
$writer = \PhpOffice\PhpSpreadsheet\IOFactory::createWriter($spreadsheetWriter, $returnType);
$writer->save($saveFile);
$spreadsheetWriter->disconnectWorksheets();
unset($spreadsheetWriter);
$results[] = $saveFile;
}
return $results;
} catch (Exception $e) {
$this->log = $e->getMessage();
return false;
}
}
/**
* 预读文件
*/
public function readaheadFromFile($file)
{
if (file_exists($file)) {
//获取统计数据
$myFilter = new MyAheadreadFilter();
$inputFileType = \PhpOffice\PhpSpreadsheet\IOFactory::identify($file);
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
$reader->setReadDataOnly(true); //只读数据
$reader->setReadFilter($myFilter);
$spreadsheet = $reader->load($file);
//$sheetData = $spreadsheet->getActiveSheet()->toArray(null, false, false, false);
list($fileName,) = explode('.', basename($file));
$datas = array();
$averageNum = ceil(array_sum($myFilter->record) / $this->cutNum);
foreach ($myFilter->record as $sheetName => $count) {
for ($i=0; $i<ceil($count/$averageNum); $i++) {
$datas[$fileName . '_' . $sheetName . '_' . $i] = array($i*$averageNum, ($i+1)*$averageNum-1, $sheetName);
}
}
return $datas;
} else {
throw new Exception($file . ' not exists');
}
}
/**
* 创建目录
* @param $file
* @return bool|string
*/
protected function getFileDir($file)
{
$baseName = basename($file);
list($name) = explode('.', $baseName);
$fullName = $name .'_'. time() . '_' . mt_rand(1000, 9999);
$path = $this->fileDir . $fullName . '/';
mkdir($path, 0777);
chmod($path, 0777);
if (is_dir($path)) {
return $path;
} else {
$this->log = "mkdir {$path} failed";
return false;
}
}
}
赞(0)
声明:本网站发布的内容(图片、视频和文字)以原创、转载和分享网络内容为主,如果涉及侵权请尽快告知,我们将会在第一时间删除。文章观点不代表本网站立场,如需处理请联系客服。电话:028-62778877-8306;邮箱:fanjiao@west.cn。本站原创内容未经允许不得转载,或转载时需注明出处:西部数码知识库 » PHP如何切割excel大文件

登录

找回密码

注册