The Challenge of High-Speed Data
In high-performance embedded systems, the bottleneck is rarely the processing power itself, but the movement of data. Using the Zynq-7000's AXI High-Performance (HP) Slave ports provides a dedicated, low-latency pathway between Programmable Logic (PL) and DDR memory. However, this efficiency comes at a cost: the need for manual cache management.
Required Materials
- Hardware: Cora Z7-07S (Zynq-7000)
- Toolchain: Xilinx Design Pack 2018.3 (Vivado + SDK)
- Monitoring: Tera Term (Serial) & HxD (Binary analysis)
DMA Polling Architecture

Figure 1: The AXI DMA acts as the "mover," bridging the gap between DDR memory and custom logic streams.
Mastering Cache Coherency
Because the HP ports bypass the CPU's cache hierarchy, the DMA controller is oblivious to the L1/L2 caches. Failing to synchronize leads to "stale data" bugs in both directions: the DMA may read outdated values from DDR while the CPU's latest writes still sit in the cache, or the CPU may read an old cached copy while the fresh data written by the DMA sits in DDR.
- Xil_DCacheFlushRange: Essential before a DMA Read (MM2S, moving RAM to Peripheral) to push cache updates to DDR.
- Xil_DCacheInvalidateRange: Vital after a DMA Write (S2MM, Peripheral to RAM) to discard stale CPU cache lines.
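Both calls operate on whole cache lines, so it helps to know which lines a buffer actually touches. The following is a minimal sketch, not part of the Xilinx BSP: `cache_align_range` and `cache_range_t` are hypothetical names, and the 32-byte line size is an assumption based on the Cortex-A9 L1 data cache. It rounds a buffer's start address down and its end address up to line boundaries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes (Cortex-A9 L1 data cache). */
#define CACHE_LINE_SIZE 32u

/* Hypothetical helper type: an address range covering whole cache lines. */
typedef struct {
    uintptr_t start; /* start address rounded down to a line boundary */
    size_t    len;   /* length in bytes, a multiple of the line size  */
} cache_range_t;

/* Expand [addr, addr + len) to the smallest range of whole cache lines. */
static cache_range_t cache_align_range(uintptr_t addr, size_t len)
{
    cache_range_t r;
    uintptr_t mask = (uintptr_t)CACHE_LINE_SIZE - 1u;
    uintptr_t end;

    /* The masking below is only valid for power-of-two line sizes. */
    assert((CACHE_LINE_SIZE & (CACHE_LINE_SIZE - 1u)) == 0u);

    r.start = addr & ~mask;            /* round start down */
    end = (addr + len + mask) & ~mask; /* round end up     */
    r.len = (size_t)(end - r.start);
    return r;
}
```

If the aligned range is larger than the buffer itself, the buffer shares cache lines with neighbouring data. Placing DMA buffers on line boundaries and sizing them in whole lines avoids that overlap.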
Configure the hardware
XAxiDma_Config *CfgPtr;

CfgPtr = XAxiDma_LookupConfig(DeviceId);
if (!CfgPtr) {
    xil_printf("No config found for %d\r\n", DeviceId);
    return XST_FAILURE;
}

Status = XAxiDma_CfgInitialize(&AxiDma, CfgPtr);
if (Status != XST_SUCCESS) {
    xil_printf("Initialization failed %d\r\n", Status);
    return XST_FAILURE;
}
Disable interrupts for polling mode
// s2mm_intr: Device to DMA (Memory) Interrupt
XAxiDma_IntrDisable(&AxiDma, XAXIDMA_IRQ_ALL_MASK,
                    XAXIDMA_DEVICE_TO_DMA);
// mm2s_intr: DMA (Memory) to Device Interrupt
XAxiDma_IntrDisable(&AxiDma, XAXIDMA_IRQ_ALL_MASK,
                    XAXIDMA_DMA_TO_DEVICE);
// Generate TxBuffer Data
...
// Flush the cache to ensure data coherency
Xil_DCacheFlushRange((UINTPTR)TxBufferPtr, MAX_PKT_LEN);
Xil_DCacheFlushRange((UINTPTR)RxBufferPtr, MAX_PKT_LEN);
Initiate the transfer
// S2MM: Device to DMA (Memory)
Status = XAxiDma_SimpleTransfer(&AxiDma, (UINTPTR)RxBufferPtr,
                                MAX_PKT_LEN, XAXIDMA_DEVICE_TO_DMA);
if (Status != XST_SUCCESS) {
    return XST_FAILURE;
}
// MM2S: DMA (Memory) to Device
Status = XAxiDma_SimpleTransfer(&AxiDma, (UINTPTR)TxBufferPtr,
                                MAX_PKT_LEN, XAXIDMA_DMA_TO_DEVICE);
if (Status != XST_SUCCESS) {
    return XST_FAILURE;
}
// Polling for completion
while (XAxiDma_Busy(&AxiDma, XAXIDMA_DEVICE_TO_DMA) ||
       XAxiDma_Busy(&AxiDma, XAXIDMA_DMA_TO_DEVICE)) {
    /* Wait */
}
// Invalidate the cache so the CPU reads the received data from DDR
Xil_DCacheInvalidateRange((UINTPTR)RxBufferPtr, MAX_PKT_LEN);
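The polling loop above spins forever if a transfer never completes, for example when the stream side stalls. One possible way to guard against that is a bounded poll, sketched below. Everything here is hypothetical and hardware-independent: `poll_until_idle`, the `busy_fn` callback type, and the `countdown_busy` stand-in are illustration names, not part of the Xilinx driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback: returns nonzero while the transfer is still busy. */
typedef int (*busy_fn)(void *ctx);

/* Poll at most max_polls times. Returns 0 once idle, -1 on timeout. */
static int poll_until_idle(busy_fn is_busy, void *ctx, uint32_t max_polls)
{
    assert(is_busy != 0);

    while (max_polls-- > 0u) {
        if (!is_busy(ctx)) {
            return 0; /* transfer finished */
        }
    }
    return -1; /* gave up: still busy after max_polls checks */
}

/* Stand-in for the hardware: reports busy for *ctx more polls, then idle. */
static int countdown_busy(void *ctx)
{
    uint32_t *remaining = (uint32_t *)ctx;

    if (*remaining == 0u) {
        return 0;
    }
    (*remaining)--;
    return 1;
}
```

On the target, the callback would instead check XAxiDma_Busy for both channels, and a timeout return would let the application report the stalled transfer instead of hanging.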