DMA in the Solaris Kernel

This article describes how Direct Virtual Memory Access (DVMA) and Direct Memory Access (DMA) work in Solaris, including how DVMA resources are set up, how the IOMMU translates addresses, and the difference between consistent and streaming DMA.


Direct Memory Access

Solaris 2.x supports two types of memory access: Direct Memory Access (DMA) and Direct Virtual Memory Access (DVMA). The difference between the two is the type of memory address the system provides to PCI devices. See the chapter on DMA in Writing Device Drivers for a general discussion of the DDI functions for DVMA.

On platforms that support DVMA, the system provides a virtual address to the device to perform transfers. SPARC systems typically use DVMA for memory access; Sun SPARC platforms implement DVMA through the IOMMU in the host-PCI bridge (HPB).

The driver establishes mappings between DMA objects (memory buffers) and DVMA resources (DVMA virtual pages in the kernel address space) during DVMA setup. During a DVMA transfer, the driver passes the DVMA resources to the DMA engine of the PCI device. The DMA engine presents the DVMA address to the IOMMU, which translates it to the physical address of the DMA object. The driver specifies the DMA characteristics of its device in the ddi_dma_attr(9S) structure.

DMA Objects

A DMA object is a region of contiguous physical memory that is represented by a kernel virtual address, a page I/O buf structure, or a locked-down physical I/O buf structure. The memory object must be locked (that is, the memory cannot be reclaimed by the system and its data paged out) before it can be used as a DMA object. Using an unlocked DMA object for DVMA may have unpredictable consequences.

Device drivers use ddi_dma_mem_alloc(9F) to allocate a private buffer for DVMA transfers. ddi_dma_mem_alloc(9F) returns a kernel virtual address backed by locked physical memory. Device drivers should use ddi_dma_mem_alloc(9F) rather than kmem_alloc(9F) for DMA memory allocation because ddi_dma_mem_alloc(9F) allocates memory that obeys the device's DMA attributes (ddi_dma_attr(9S)) and is already locked down for DMA.
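A minimal sketch of this allocation path is shown below. The xx_ names and the state structure are hypothetical, and the attribute values must come from the real device; only the DDI interfaces and flags are the standard ones.

#include <sys/types.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/*
 * Minimal sketch of allocating a private DVMA buffer with
 * ddi_dma_mem_alloc(9F).  The xx_ names and the state structure are
 * hypothetical; the DDI calls and flags are the standard interfaces.
 */
typedef struct xx_state {
        dev_info_t              *xx_dip;
        ddi_dma_handle_t        xx_dma_handle;
        ddi_acc_handle_t        xx_acc_handle;
        caddr_t                 xx_kaddr;       /* locked kernel virtual address */
        size_t                  xx_real_len;    /* actual length allocated */
} xx_state_t;

static ddi_dma_attr_t xx_dma_attr;              /* filled in from the device's capabilities */

static ddi_device_acc_attr_t xx_acc_attr = {
        DDI_DEVICE_ATTR_V0,
        DDI_NEVERSWAP_ACC,
        DDI_STRICTORDER_ACC
};

static int
xx_alloc_dma_buffer(xx_state_t *sp, size_t len)
{
        /* The handle carries the device's DMA attributes for later calls. */
        if (ddi_dma_alloc_handle(sp->xx_dip, &xx_dma_attr, DDI_DMA_SLEEP,
            NULL, &sp->xx_dma_handle) != DDI_SUCCESS)
                return (DDI_FAILURE);

        /* Returns a kernel virtual address backed by locked physical memory. */
        if (ddi_dma_mem_alloc(sp->xx_dma_handle, len, &xx_acc_attr,
            DDI_DMA_CONSISTENT, DDI_DMA_SLEEP, NULL, &sp->xx_kaddr,
            &sp->xx_real_len, &sp->xx_acc_handle) != DDI_SUCCESS) {
                ddi_dma_free_handle(&sp->xx_dma_handle);
                return (DDI_FAILURE);
        }
        return (DDI_SUCCESS);
}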

The kernel virtual address returned by ddi_dma_mem_alloc(9F) is allocated from the kernel resource map (kernelmap in the diagram on the Device Addressing Map page), which is a scarce resource and can be scarcer than physical memory in high-end configurations. On sun4u platforms, the size of kernelmap is 2.5 Gbytes.

Physical memory represented by the page I/O buf structure is typically allocated by file systems. The page I/O buf structure is identified by the B_PAGEIO flag in the b_flags field and is associated with physical memory that has already been locked down by the system. File systems usually pass the page I/O buf structure to the block driver's strategy(9E) entry point for data I/O.
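For illustration, the following sketch of a strategy(9E) routine binds the buf it receives. The xx_ names and the I/O start helper are hypothetical and teardown is omitted; ddi_dma_buf_bind_handle(9F) handles both B_PAGEIO and B_PHYS bufs because the underlying memory is already locked.

#include <sys/types.h>
#include <sys/errno.h>
#include <sys/buf.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

static ddi_dma_handle_t xx_dma_handle;          /* allocated at attach time */
static void xx_start_io(ddi_dma_cookie_t *cp, uint_t ccount);   /* hypothetical */

static int
xx_strategy(struct buf *bp)
{
        ddi_dma_cookie_t cookie;
        uint_t ccount;
        uint_t dir = (bp->b_flags & B_READ) ? DDI_DMA_READ : DDI_DMA_WRITE;

        if (ddi_dma_buf_bind_handle(xx_dma_handle, bp,
            dir | DDI_DMA_STREAMING, DDI_DMA_SLEEP, NULL,
            &cookie, &ccount) != DDI_DMA_MAPPED) {
                bioerror(bp, EIO);
                biodone(bp);
                return (0);
        }
        xx_start_io(&cookie, ccount);   /* program the device's DMA engine */
        return (0);
}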

The physical I/O buf structure is typically allocated in the character driver's read(9E) or write(9E) entry points and is passed to physio(9F) for performing data I/O. physio(9F) locks the memory associated with b_addr and sets the B_PHYS bit of the b_flags field.
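A sketch of such an entry point, assuming a strategy routine like the hypothetical xx_strategy sketched above:

#include <sys/types.h>
#include <sys/buf.h>
#include <sys/uio.h>
#include <sys/cred.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/*
 * Hypothetical read(9E) entry point.  physio(9F) locks down the user
 * memory described by uiop, marks the buf with B_PHYS, and hands it to
 * the driver's strategy(9E) routine for the actual DVMA transfer.
 */
static int xx_strategy(struct buf *bp);         /* assumed, as sketched above */

static int
xx_read(dev_t dev, struct uio *uiop, cred_t *credp)
{
        return (physio(xx_strategy, NULL, dev, B_READ, minphys, uiop));
}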

DVMA Resources and IOMMU Translations

The DVMA resource is a contiguous virtual page region in the kernel virtual address space dedicated to DVMA (dvmamap in the diagram on the Device Addressing Map page). DVMA pages occupy the top end of the kernel virtual address space. There is a one-to-one correspondence between a DVMA page and an entry in the IOMMU translation storage buffer (TSB). The size of the DVMA address space is the number of entries in the TSB multiplied by IOMMU_PAGESIZE (8 Kbytes). When no DVMA resources are available, the system either sleeps until resources become available, fails the request immediately, or fails the request and invokes a driver-supplied callback when resources may have become available, depending on whether the driver passed DDI_DMA_SLEEP, DDI_DMA_DONTWAIT, or a callback function to the bind routine (see the sketch below).
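A sketch of the non-blocking option follows; the xx_ names are hypothetical, while the flags and return codes are the standard DDI values.

#include <sys/types.h>
#include <sys/errno.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/*
 * With DDI_DMA_DONTWAIT the bind fails immediately with
 * DDI_DMA_NORESOURCES instead of sleeping when DVMA resources run out.
 */
static int
xx_try_bind(ddi_dma_handle_t handle, caddr_t kaddr, size_t len,
    ddi_dma_cookie_t *cookiep, uint_t *ccountp)
{
        int rv;

        rv = ddi_dma_addr_bind_handle(handle, NULL, kaddr, len,
            DDI_DMA_WRITE | DDI_DMA_STREAMING, DDI_DMA_DONTWAIT, NULL,
            cookiep, ccountp);

        switch (rv) {
        case DDI_DMA_MAPPED:
                return (0);
        case DDI_DMA_NORESOURCES:
                return (EAGAIN);        /* caller retries or queues the request */
        default:
                return (EIO);
        }
}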

The system uses a dedicated region of the kernel address space for DVMA so that a contiguous virtual address range can be allocated for fragmented physical memory. Although a kernel virtual address allocated from kernelmap could, in theory, also be used in IOMMU translations, the system loads the IOMMU with virtual addresses from dvmamap, which also accommodates devices with limited DMA capabilities (for example, less than 32-bit addressing). Since the system provides a contiguous virtual region to the device for DVMA transfers, the device does not need scatter/gather DMA capability.

Device drivers use ddi_dma_addr_bind_handle(9F) or ddi_dma_buf_bind_handle(9F) to allocate DVMA resources and to load the DVMA virtual address and physical address pairs into the TSB entries. The allocated DVMA resource (a 32-bit kernel virtual address) is stored in the DMA cookie (the dmac_address field of ddi_dma_cookie(9S)), which is later used to program the device's DMA engine.
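The following sketch binds a kernel virtual address and walks the resulting cookies; xx_program_dma_engine() stands in for whatever device-specific register writes a real driver performs, and the error paths are abbreviated.

#include <sys/types.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

static void xx_program_dma_engine(uint32_t dvma_addr, size_t len);      /* hypothetical */

static int
xx_bind_and_start(ddi_dma_handle_t handle, caddr_t kaddr, size_t len)
{
        ddi_dma_cookie_t cookie;
        uint_t ccount, i;

        if (ddi_dma_addr_bind_handle(handle, NULL, kaddr, len,
            DDI_DMA_READ | DDI_DMA_CONSISTENT, DDI_DMA_SLEEP, NULL,
            &cookie, &ccount) != DDI_DMA_MAPPED)
                return (DDI_FAILURE);

        /*
         * dmac_address holds the 32-bit DVMA address that the IOMMU
         * translates to the physical address of the underlying memory.
         */
        for (i = 0; i < ccount; i++) {
                xx_program_dma_engine(cookie.dmac_address, cookie.dmac_size);
                if (i + 1 < ccount)
                        ddi_dma_nextcookie(handle, &cookie);
        }
        return (DDI_SUCCESS);
}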

When a PCI device presents the 32-bit virtual address of a DVMA page (the dmac_address field of ddi_dma_cookie(9S)) to the IOMMU, the IOMMU searches the TLB to see if a VA-to-PA translation is already available. On a TLB miss, the IOMMU performs a TSB lookup. An error is returned to the PCI master device if the TSB lookup fails to locate a valid mapping.

The IOMMU is involved in a DVMA transfer only if a virtual-to-physical address translation is required. Some transfers do not need such a translation, for example, a peer-to-peer transfer from one PCI device to another on the same PCI bus. In a peer-to-peer DVMA transfer, the PCI master knows the physical address of the target device, which is a base address on the same PCI bus plus an offset. The master simply programs that PCI address into its DMA engine. The system is intelligent enough not to allocate DVMA resources in the peer-to-peer case.

Consistent vs. Streaming DMA

Sun SPARC platforms support both consistent and streaming DMA through the streaming cache (STC) in the HPB. ddi_dma_addr_bind_handle(9F) and ddi_dma_buf_bind_handle(9F) allow device drivers to specify which mode of DVMA access to use in the flag argument. The default mode is streaming DVMA unless DDI_DMA_CONSISTENT is specified. Streaming versus consistent mode is selected on a per-page basis when the IOMMU loads the TSB entry.
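A small, hypothetical sketch of how the flag argument selects the mode; only the DDI call and flags are real.

#include <sys/types.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/*
 * DDI_DMA_CONSISTENT suits structures that the CPU and device both
 * inspect (for example, descriptor rings); DDI_DMA_STREAMING suits
 * large one-way data buffers.
 */
static int
xx_bind(ddi_dma_handle_t handle, caddr_t kaddr, size_t len, int consistent,
    ddi_dma_cookie_t *cookiep, uint_t *ccountp)
{
        uint_t mode = consistent ? DDI_DMA_CONSISTENT : DDI_DMA_STREAMING;

        return (ddi_dma_addr_bind_handle(handle, NULL, kaddr, len,
            DDI_DMA_RDWR | mode, DDI_DMA_SLEEP, NULL, cookiep, ccountp));
}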

For consistent reads, the HPB treats all three PCI memory read commands (MR, MRL, and MRM) identically. The HPB always generates complete 64-byte cache line requests on the UPA bus and returns the correct amount of data to the PCI bus.

For streaming reads, the HPB passes information about which PCI read command was used to the STC so that it can decide whether to prefetch. If MRL or MRM is used, a prefetch of the next sequential block of memory is attempted. If the MR command is used, a prefetch is performed only if the last byte of the cache line is read.

For consistent writes, arbitrary byte transfers are allowed. Data destined for cacheable space is passed to the Merge Buffer of the HPB for partial cache line writes. The HPB also enforces certain ordering constraints between consistent writes (cacheable or non-cacheable) and subsequent synchronization events.

For streaming writes, arbitrary byte transfers are not allowed, and data must be contiguous within a PCI transaction. The two PCI write commands, MW and MWI, are treated identically for both consistent and streaming DVMA. The IDU of the HPB delays interrupt dispatch until the write data has been sent to memory.

The performance of consistent access can be worse than that of streaming access (particularly for reads or sub-line writes), so DVMA pages should be marked consistent mode only when necessary. DVMA accesses to non-cacheable memory (for example, a frame buffer) are always treated as consistent mode.

Since streaming data does not participate in the system memory coherence protocol, the driver must call ddi_dma_sync(9F) to flush the STC when a streaming DVMA transfer is completed.
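A minimal sketch of the completion path for a streaming transfer from the device to memory; the handle and the offset/length arguments are whatever the driver bound earlier.

#include <sys/types.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

static int
xx_streaming_read_done(ddi_dma_handle_t handle, off_t off, size_t len)
{
        /* Flush the STC so the CPU sees the data the device just wrote. */
        if (ddi_dma_sync(handle, off, len, DDI_DMA_SYNC_FORKERNEL) !=
            DDI_SUCCESS)
                return (DDI_FAILURE);

        /* Release the DVMA resources; the data is now visible in memory. */
        return (ddi_dma_unbind_handle(handle));
}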

ddi_dma_attr(9S)

Device drivers use the ddi_dma_attr(9S) structure to specify the characteristics of the device's DMA engine.

typedef struct ddi_dma_attr {
        uint_t              dma_attr_version;    /* version number (DMA_ATTR_V0) */
        unsigned long long  dma_attr_addr_lo;    /* lowest usable DMA address */
        unsigned long long  dma_attr_addr_hi;    /* highest usable DMA address */
        unsigned long long  dma_attr_count_max;  /* maximum DMA engine transfer count per cookie */
        unsigned long long  dma_attr_align;      /* buffer alignment requirement */
        uint_t              dma_attr_burstsizes; /* supported burst sizes (bit i = 2^i bytes) */
        uint_t              dma_attr_minxfer;    /* minimum effective transfer size */
        unsigned long long  dma_attr_maxxfer;    /* maximum transfer size */
        unsigned long long  dma_attr_seg;        /* address boundary a transfer may not cross */
        int                 dma_attr_sgllen;     /* maximum scatter/gather list length */
        uint_t              dma_attr_granular;   /* granularity of the device's transfers */
        int                 dma_attr_flags;      /* attribute flags */
} ddi_dma_attr_t;
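As an illustration only, an attribute structure like the xx_dma_attr used in the earlier sketches might be initialized as follows for a hypothetical DMA engine limited to 32-bit addressing; every value must be taken from the actual device's documentation.

static ddi_dma_attr_t xx_dma_attr = {
        DMA_ATTR_V0,            /* dma_attr_version */
        0x00000000ULL,          /* dma_attr_addr_lo: lowest usable address */
        0xFFFFFFFFULL,          /* dma_attr_addr_hi: 32-bit addressing only */
        0x00FFFFFFULL,          /* dma_attr_count_max */
        0x00000008ULL,          /* dma_attr_align: 8-byte alignment */
        0x0078,                 /* dma_attr_burstsizes: 8-, 16-, 32-, 64-byte bursts */
        0x00000001,             /* dma_attr_minxfer */
        0x00FFFFFFULL,          /* dma_attr_maxxfer */
        0xFFFFFFFFULL,          /* dma_attr_seg */
        1,                      /* dma_attr_sgllen: one cookie; DVMA provides contiguity */
        512,                    /* dma_attr_granular: e.g. a 512-byte sector device */
        0                       /* dma_attr_flags */
};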

 

 