Base64 encoding and decoding of large files

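This article describes a way to Base64-encode and decode large files without running out of memory: the file is read and converted chunk by chunk, so files of any size can be processed.
The System.Convert class provides two basic methods, Convert.ToBase64String() and Convert.FromBase64String(), which encode a byte array to a Base64 string and decode a Base64 string back to a byte array.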

public string Encode(byte[] data)
{
    return Convert.ToBase64String(data);
}

public byte[] Decode(string strBase64)
{
    return Convert.FromBase64String(strBase64);
}

They work well for encoding and decoding small amounts of data, but in some cases they are a disaster.

For example, if you want to encode a 4 GB file to Base64, the code above will throw an OutOfMemoryException, because the whole file must first be read into a byte array and the encoded string must fit in memory as well. So we need another way to encode and decode Base64.

A long time ago, someone posted an article about how to deal with this:

http://blogs.microsoft.co.il/blogs/kim/archive/2007/10/09/base64-encode-large-files-very-large-files.aspx

The author uses XmlWriter to work around it.

By studying the basics of Base64 encoding in the RFC, I found another, more direct way to deal with it.

According to RFC 3548, Base64 encodes every 3 bytes of input as 4 characters; if the last group has fewer than 3 bytes, the output is padded with the '=' character. So we can encode the file in small chunks whose size is a multiple of 3 bytes, and get the encoding of the whole file by concatenating the encodings of the chunks. Only the last chunk can then need padding.
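The chunking property described above can be checked on an in-memory array before applying it to files. The following is a minimal sketch, not code from the original post; the class name Base64ChunkDemo and the method EncodeInChunks are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text;

public static class Base64ChunkDemo
{
    // Hypothetical helper: encodes each chunk independently and concatenates the results.
    // chunkSize must be a positive multiple of 3 so that only the last chunk can be padded.
    public static string EncodeInChunks(byte[] data, int chunkSize)
    {
        if (chunkSize <= 0 || chunkSize % 3 != 0)
            throw new ArgumentException("chunkSize must be a positive multiple of 3", nameof(chunkSize));

        var result = new StringBuilder();
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            // The final chunk may be shorter than chunkSize.
            int count = Math.Min(chunkSize, data.Length - offset);
            result.Append(Convert.ToBase64String(data, offset, count));
        }
        return result.ToString();
    }
}
```

For any byte array, EncodeInChunks(data, 300) should equal Convert.ToBase64String(data). With a chunk size that is not a multiple of 3 the property breaks: encoding the ASCII bytes of "abcde" in chunks of 4 would give "YWJjZA==" followed by "ZQ==", whereas the correct encoding of the whole input is "YWJjZGU=".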

So I have the code below:

// Needs: using System; using System.IO; using System.Text;
public void EncodeFile(string inputFile, string outputFile)
{
    using (FileStream inputStream = File.Open(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read))
    using (StreamWriter outputWriter = new StreamWriter(outputFile, false, Encoding.ASCII))
    {
        byte[] data = new byte[3 * 1024]; // Chunk size is 3 KB (a multiple of 3 bytes)

        while (true)
        {
            // Fill the buffer completely: Stream.Read may return fewer bytes than
            // requested, and a short chunk in mid-file would emit '=' padding too early.
            int read = 0, n;
            while (read < data.Length && (n = inputStream.Read(data, read, data.Length - read)) > 0)
                read += n;
            if (read == 0)
                break; // end of file

            outputWriter.Write(Convert.ToBase64String(data, 0, read));
        }
    } // the using blocks close both files
}


        
// Assumes the input contains no line breaks, as written by EncodeFile above.
public void DecodeFile(string inputFile, string outputFile)
{
    using (FileStream inputStream = File.Open(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read))
    using (FileStream outputStream = File.Create(outputFile))
    {
        byte[] data = new byte[4 * 1024]; // Chunk size is 4 KB (a multiple of 4 characters)

        while (true)
        {
            // Fill the buffer completely so that every chunk except the last
            // holds a whole number of 4-character Base64 groups.
            int read = 0, n;
            while (read < data.Length && (n = inputStream.Read(data, read, data.Length - read)) > 0)
                read += n;
            if (read == 0)
                break; // end of file

            byte[] chunk = Convert.FromBase64String(Encoding.ASCII.GetString(data, 0, read));
            outputStream.Write(chunk, 0, chunk.Length);
        }
    } // the using blocks close both files
}


These methods can also be adapted to write and read the MIME format (76 characters per line):

public static void EncodeFile(string inputFile, string outputFile)
{
    using (FileStream inputStream = File.Open(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read))
    using (StreamWriter outputWriter = new StreamWriter(outputFile, false, Encoding.ASCII))
    {
        outputWriter.NewLine = "\r\n"; // same line ending that InsertLineBreaks emits
        byte[] data = new byte[57 * 1024]; // Chunk size is 57 KB: every 57 bytes encode to one 76-character line

        while (true)
        {
            // Fill the buffer completely: a short chunk in mid-file would
            // produce a short line and early '=' padding.
            int read = 0, n;
            while (read < data.Length && (n = inputStream.Read(data, read, data.Length - read)) > 0)
                read += n;
            if (read == 0)
                break; // end of file

            // InsertLineBreaks breaks lines inside the chunk; WriteLine ends its last line.
            outputWriter.WriteLine(Convert.ToBase64String(data, 0, read, Base64FormattingOptions.InsertLineBreaks));
        }
    } // the using blocks close both files
}
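The 57 KB chunk size used in the MIME encoder above follows from the line length: 57 bytes are 19 groups of 3 bytes, and 19 × 4 = 76 characters, so a chunk that is a multiple of 57 bytes always ends exactly at a line boundary. The sketch below illustrates that arithmetic; MimeLineDemo, EncodedLength and EncodeToLines are hypothetical names used only for this example.

```csharp
using System;

public static class MimeLineDemo
{
    // Number of Base64 characters for byteCount input bytes
    // (padding included, line breaks not counted).
    public static int EncodedLength(int byteCount)
    {
        return ((byteCount + 2) / 3) * 4;
    }

    // Encodes with Base64FormattingOptions.InsertLineBreaks, which inserts "\r\n"
    // after every 76 characters, and returns the resulting lines.
    public static string[] EncodeToLines(byte[] data)
    {
        string text = Convert.ToBase64String(data, 0, data.Length, Base64FormattingOptions.InsertLineBreaks);
        return text.Split(new[] { "\r\n" }, StringSplitOptions.None);
    }
}
```

EncodedLength(57) is 76, so 114 input bytes yield two full lines, while 100 input bytes yield one full line followed by a shorter, padded one. That is why only the last chunk of a file may end in a short line.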


        
public static void DecodeFile(string inputFile, string outputFile)
{
    using (StreamReader reader = new StreamReader(inputFile, Encoding.ASCII, true))
    using (FileStream outputStream = File.Create(outputFile))
    {
        string line;

        // Stops at the end of the file or at the first empty line.
        while (!string.IsNullOrEmpty(line = reader.ReadLine()))
        {
            if (line.Length > 76)
                throw new InvalidDataException("Invalid mime-format base64 file");

            byte[] chunk = Convert.FromBase64String(line);
            outputStream.Write(chunk, 0, chunk.Length);
        }
    } // the using blocks close both files
}
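As an aside that is not part of the approach above, the .NET class library also provides ToBase64Transform and FromBase64Transform in System.Security.Cryptography, which can be combined with CryptoStream to convert data stream to stream without manual chunking. The following is a minimal sketch under that assumption; Base64StreamDemo, EncodeStream and DecodeStream are hypothetical names, and the output contains no line breaks.

```csharp
using System.IO;
using System.Security.Cryptography;

public static class Base64StreamDemo
{
    // Writes the Base64 text (as ASCII bytes) of everything in input to output.
    // Note: disposing the CryptoStream also closes the output stream.
    public static void EncodeStream(Stream input, Stream output)
    {
        using (var transform = new ToBase64Transform())
        using (var base64Stream = new CryptoStream(output, transform, CryptoStreamMode.Write))
        {
            input.CopyTo(base64Stream);
            // Leaving the using block flushes the final group, including any '=' padding.
        }
    }

    // Reads Base64 text from input and writes the decoded bytes to output.
    // Note: disposing the CryptoStream also closes the input stream.
    public static void DecodeStream(Stream input, Stream output)
    {
        using (var transform = new FromBase64Transform())
        using (var base64Stream = new CryptoStream(input, transform, CryptoStreamMode.Read))
        {
            base64Stream.CopyTo(output);
        }
    }
}
```

With files, the streams would typically come from File.OpenRead and File.Create. Whether this is faster than the chunked Convert calls depends on the runtime version, so it is worth measuring before switching.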