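System design:
An FPGA + W5300 scheme implements the TCP/IP and JTAG protocols. An FPGA with a hard Cortex-M3 core was chosen so that the core can parse the XVC protocol stack and the custom protocol.
JTAG timing analysis
Because the Debug Bridge IP runs only on Xilinx FPGAs, its JTAG timing is captured in simulation and a hand-written module performs the JTAG protocol conversion.
The Xilinx Debug Bridge IP is simulated in Vivado to obtain the JTAG timing information: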
repeat(200)@(posedge tb_ACLK);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0000,4, 32'd32, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0004,4, 32'haaaaaaaa, resp);//TMS
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0008,4, 32'h55555555, resp);//TDI
//tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_000c,4, 32'h0, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0010,4, 32'd1, resp);
for(index=0;index<20;index=index+1) begin
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_0010,4,read_data_10, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_000C,4,read_data_0c,resp);
repeat(10)@(posedge tb_ACLK);
end
repeat(200)@(posedge tb_ACLK);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0000,4, 32'd32, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0004,4, 32'hffffffff, resp);//TMS
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0008,4, 32'h0, resp);//TDI
//tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_000c,4, 32'h0, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0010,4, 32'd1, resp);
for(index=0;index<20;index=index+1) begin
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_0010,4,read_data_10, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_000C,4,read_data_0c,resp);
repeat(10)@(posedge tb_ACLK);
end
repeat(200)@(posedge tb_ACLK);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0000,4, 32'd32, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0004,4, 32'h11223344, resp);//TMS
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0008,4, 32'h55aa55aa, resp);//TDI
//tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_000c,4, 32'h0, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0010,4, 32'd1, resp);
for(index=0;index<20;index=index+1) begin
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_0010,4,read_data_10, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_000C,4,read_data_0c,resp);
repeat(10)@(posedge tb_ACLK);
end
repeat(200)@(posedge tb_ACLK);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0000,4, 32'd3, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0004,4, 32'h3344, resp);//TMS
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0008,4, 32'h1122, resp);//TDI
//tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_000c,4, 32'h0, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0010,4, 32'd1, resp);
for(index=0;index<20;index=index+1) begin
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_0010,4,read_data_10, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_000C,4,read_data_0c,resp);
repeat(10)@(posedge tb_ACLK);
end
repeat(200)@(posedge tb_ACLK);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0000,4, 32'd2, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0004,4, 32'h3344, resp);//TMS
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0008,4, 32'h1122, resp);//TDI
//tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_000c,4, 32'h0, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0010,4, 32'd1, resp);
for(index=0;index<20;index=index+1) begin
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_0010,4,read_data_10, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_000C,4,read_data_0c,resp);
repeat(10)@(posedge tb_ACLK);
end
repeat(200)@(posedge tb_ACLK);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0000,4, 32'd2, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0004,4, 32'h3344, resp);//TMS
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0008,4, 32'h1122, resp);//TDI
//tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_000c,4, 32'h0, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0010,4, 32'd1, resp);
for(index=0;index<20;index=index+1) begin
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_0010,4,read_data_10, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_000C,4,read_data_0c,resp);
repeat(10)@(posedge tb_ACLK);
end
repeat(200)@(posedge tb_ACLK);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0000,4, 32'd1, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0004,4, 32'h3344, resp);//TMS
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0008,4, 32'h1122, resp);//TDI
//tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_000c,4, 32'h0, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.write_data(32'h43C0_0010,4, 32'd1, resp);
for(index=0;index<20;index=index+1) begin
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_0010,4,read_data_10, resp);
tb.zynq_sys.base_zynq_i.processing_system7_0.inst.read_data(32'h43C0_000C,4,read_data_0c,resp);
repeat(10)@(posedge tb_ACLK);
end
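The register sequence that the testbench exercises (length at 0x00, TMS at 0x04, TDI at 0x08, start via 0x10, then poll 0x10 and read TDO at 0x0C) can be sketched in C as follows. This is only an illustration: the db_bus_t accessor struct, db_shift, and the loopback mock are hypothetical names introduced here, and the assumption that the control register reads back 0 once the shift has finished is inferred from the polling loop in the testbench.

```c
#include <stdint.h>

/* Register offsets as exercised by the testbench (base 0x43C0_0000). */
enum { DB_LENGTH = 0x00, DB_TMS = 0x04, DB_TDI = 0x08, DB_TDO = 0x0C, DB_CTRL = 0x10 };

/* Hypothetical bus accessors supplied by the caller (AXI on ZYNQ, AHB on the M3, ...). */
typedef struct {
    void     (*write32)(void *ctx, uint32_t offset, uint32_t value);
    uint32_t (*read32)(void *ctx, uint32_t offset);
    void      *ctx;
} db_bus_t;

/* Shift 1..32 bits. Returns 0 on success, -1 on a bad length, -2 on timeout. */
int db_shift(const db_bus_t *bus, uint32_t nbits, uint32_t tms, uint32_t tdi,
             uint32_t *tdo, unsigned max_polls)
{
    if (nbits == 0 || nbits > 32)
        return -1;
    bus->write32(bus->ctx, DB_LENGTH, nbits);
    bus->write32(bus->ctx, DB_TMS, tms);
    bus->write32(bus->ctx, DB_TDI, tdi);
    bus->write32(bus->ctx, DB_CTRL, 1);              /* start the shift */
    while (max_polls--) {
        /* Assumption: CTRL reads 0 when the shift is complete. */
        if (bus->read32(bus->ctx, DB_CTRL) == 0) {
            *tdo = bus->read32(bus->ctx, DB_TDO);
            return 0;
        }
    }
    return -2;
}

/* Loopback mock for host-side testing only: TDO mirrors TDI, masked to the length. */
typedef struct { uint32_t length, tms, tdi, tdo, ctrl; } db_mock_t;

void db_mock_write32(void *ctx, uint32_t offset, uint32_t value)
{
    db_mock_t *m = (db_mock_t *)ctx;
    switch (offset) {
    case DB_LENGTH: m->length = value; break;
    case DB_TMS:    m->tms = value;    break;
    case DB_TDI:    m->tdi = value;    break;
    case DB_CTRL:
        if (value & 1u) {
            uint32_t mask = (m->length >= 32) ? 0xFFFFFFFFu : ((1u << m->length) - 1u);
            m->tdo  = m->tdi & mask;   /* pretend the shift finished instantly */
            m->ctrl = 0;
        }
        break;
    default: break;
    }
}

uint32_t db_mock_read32(void *ctx, uint32_t offset)
{
    db_mock_t *m = (db_mock_t *)ctx;
    return (offset == DB_TDO) ? m->tdo : (offset == DB_CTRL) ? m->ctrl : 0;
}
```

Passing the bus accessors in as function pointers keeps the sequence testable on a PC; on the target they would be thin wrappers around volatile register accesses.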
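Timing simulation results:
Signal | Debug Bridge IP | Target device
TDO | Sampled on the TCK rising edge | Driven on the TCK falling edge
TDI | Driven on the TCK falling edge; note that TCK is not yet toggling when the first bit is driven | Sampled on the TCK rising edge
TMS | Driven on the TCK falling edge; note that TCK is not yet toggling when the first bit is driven | Sampled on the TCK rising edge
TCK | |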
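The edge relationships observed in simulation can be written down as a small bit-level model. The sketch below is not the author's Verilog module; jtag_pins_t, jtag_shift, and the toy target are hypothetical names used only to illustrate the ordering: TMS/TDI change while TCK is low, TDO is captured at the rising edge, and the first bit is presented before any TCK pulse.

```c
#include <stdint.h>

/* Hypothetical pin hooks; a real port would drive FPGA registers or GPIO. */
typedef struct {
    void (*set_pins)(void *ctx, int tck, int tms, int tdi);
    int  (*get_tdo)(void *ctx);
    void  *ctx;
} jtag_pins_t;

/* Shift up to 32 bits, LSB first, and return the sampled TDO bits. */
uint32_t jtag_shift(const jtag_pins_t *p, unsigned nbits, uint32_t tms, uint32_t tdi)
{
    uint32_t tdo = 0;
    int m = 0, d = 0;
    if (nbits > 32)
        nbits = 32;
    for (unsigned i = 0; i < nbits; i++) {
        m = (int)((tms >> i) & 1u);
        d = (int)((tdi >> i) & 1u);
        /* TCK low: update TMS/TDI. For bit 0 TCK is already low, so the
           first data bit appears before any TCK pulse. */
        p->set_pins(p->ctx, 0, m, d);
        /* Capture the TDO value that the rising edge samples. */
        tdo |= (uint32_t)(p->get_tdo(p->ctx) & 1) << i;
        /* TCK high: the target samples TMS/TDI on this rising edge. */
        p->set_pins(p->ctx, 1, m, d);
    }
    if (nbits > 0)
        p->set_pins(p->ctx, 0, m, d);   /* leave TCK idle low */
    return tdo;
}

/* Toy target for host-side testing: a 1-bit register between TDI and TDO. */
typedef struct { int tck, reg, tdo; } jtag_toy_target_t;

void toy_set_pins(void *ctx, int tck, int tms, int tdi)
{
    jtag_toy_target_t *t = (jtag_toy_target_t *)ctx;
    (void)tms;                              /* the toy target has no TAP state machine */
    if (!t->tck && tck)
        t->reg = tdi;                       /* rising edge: sample TDI */
    else if (t->tck && !tck)
        t->tdo = t->reg;                    /* falling edge: drive TDO */
    t->tck = tck;
}

int toy_get_tdo(void *ctx)
{
    return ((jtag_toy_target_t *)ctx)->tdo;
}
```

With the toy target, the TDO stream is the TDI stream delayed by one bit, which makes the edge ordering easy to check on a PC before moving to hardware.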
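W5300 interface timing
The interface is similar to common asynchronous interfaces, so the Verilog logic simply follows the interface timing. Stress tests at high and low temperature showed no data errors.
W5300 initialization logic
Reset the W5300 and configure its IP and MAC addresses
Set the protocol stack to TCP/IP
Configure the socket parameters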
int8_t wizchip_init(uint8_t* txsize, uint8_t* rxsize)
{
int8_t i;
#if _WIZCHIP_ < W5200
int8_t j;
#endif
int16_t tmp = 0; // wider than int8_t: the W5300 branch compares the running sum against 128
wizchip_sw_reset();
if(txsize)
{
tmp = 0;
//M20150601 : For integrating with W5300
#if _WIZCHIP_ == W5300
for(i = 0 ; i < _WIZCHIP_SOCK_NUM_; i++)
{
if(txsize[i] > 64) return -1; //No use 64KB even if W5300 support max 64KB memory allocation
tmp += txsize[i];
if(tmp > 128) return -1;
}
if(tmp % 8) return -1;
#else
for(i = 0 ; i < _WIZCHIP_SOCK_NUM_; i++)
{
tmp += txsize[i];
#if _WIZCHIP_ < W5200 //2016.10.28 peter add condition for w5100 and w5100s
if(tmp > 8) return -1;
#else
if(tmp > 16) return -1;
#endif
}
#endif
for(i = 0 ; i < _WIZCHIP_SOCK_NUM_; i++)
{
#if _WIZCHIP_ < W5200 //2016.10.28 peter add condition for w5100
j = 0;
while((txsize[i] >> j != 1)&&(txsize[i] !=0)){j++;}
setSn_TXBUF_SIZE(i, j);
#else
setSn_TXBUF_SIZE(i, txsize[i]);
#endif
}
}
if(rxsize)
{
tmp = 0;
#if _WIZCHIP_ == W5300
for(i = 0 ; i < _WIZCHIP_SOCK_NUM_; i++)
{
if(rxsize[i] > 64) return -1; //No use 64KB even if W5300 support max 64KB memory allocation
tmp += rxsize[i];
if(tmp > 128) return -1;
}
if(tmp % 8) return -1;
#else
for(i = 0 ; i < _WIZCHIP_SOCK_NUM_; i++)
{
tmp += rxsize[i];
#if _WIZCHIP_ < W5200 //2016.10.28 peter add condition for w5100 and w5100s
if(tmp > 8) return -1;
#else
if(tmp > 16) return -1;
#endif
}
#endif
for(i = 0 ; i < _WIZCHIP_SOCK_NUM_; i++)
{
#if _WIZCHIP_ < W5200 // add condition for w5100
j = 0;
while((rxsize[i] >> j != 1)&&(rxsize[i] !=0)){j++;} // check rxsize here, not txsize
setSn_RXBUF_SIZE(i, j);
#else
setSn_RXBUF_SIZE(i, rxsize[i]);
#endif
}
}
return 0;
}
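The W5300 branch of the size checks above can be exercised on its own. The helper below is a minimal sketch, not part of the WIZnet library: w5300_bufsize_valid and W5300_SOCK_NUM are hypothetical names, and the limits (at most 64 KB per socket, at most 128 KB in total, total a multiple of 8 KB) are copied from the checks in wizchip_init. It uses a 16-bit accumulator because the sum can legitimately reach 128.

```c
#include <stdbool.h>
#include <stdint.h>

#define W5300_SOCK_NUM 8   /* the W5300 provides 8 hardware sockets */

/* Returns true if the per-socket buffer sizes (in KB) pass the same
   checks that wizchip_init applies when _WIZCHIP_ == W5300. */
bool w5300_bufsize_valid(const uint8_t size_kb[W5300_SOCK_NUM])
{
    uint16_t total = 0;    /* wide enough that 64 + 64 does not wrap */
    for (int i = 0; i < W5300_SOCK_NUM; i++) {
        if (size_kb[i] > 64)
            return false;  /* a single socket may use at most 64 KB */
        total += size_kb[i];
    }
    /* At most 128 KB in total, and the total must be a multiple of 8 KB. */
    return total <= 128 && (total % 8) == 0;
}
```

Calling such a check before wizchip_init gives a clear failure on the host side instead of a bare -1 from the driver.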
Socket handling state machine:
uint32_t socket_fsm_ctrl_socket0(int s)
{
uint8_t Sn_SR_s = getSn_SR(s);
uint16_t fifo_len;
switch(Sn_SR_s)
{
case SOCK_ESTABLISHED:
// Interrupt clear
if(getSn_IR(s) & Sn_IR_CON)
{
setSn_IR(s, Sn_IR_CON);
}
volatile uint8_t *recv_data_s0 = (volatile uint8_t * volatile const)FPGA_ETH_BASE_ADDR;
uint8_t ret = -1;
do {
fifo_len = recv(s, (uint8_t *)recv_data_s0, 4096); // use the socket passed in, as the other cases do
if(fifo_len == 0) {
break;
}
if(fifo_len > 4096){
break;
}
ret = xvc_data_parser((uint8_t *)recv_data_s0,fifo_len);
if(ret == 0)
{
return 0;
}
} while(0);//(fifo_len > 0);
break;
case SOCK_CLOSE_WAIT:
disconnect(s);
break;
case SOCK_CLOSED:
if(socket(s, Sn_MR_TCP, socket_port[s], SF_IO_NONBLOCK) == s) /* Reinitialize the socket */
{
}
break;
case SOCK_INIT:
if(socket(s, Sn_MR_TCP, socket_port[s], SF_IO_NONBLOCK) == s) /* Reinitialize the socket */
{
}
listen(s);
break;
case SOCK_LISTEN:
break;
default :
break;
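} // end of switch
return 1; // assumption: nonzero means "no XVC command completed"; the original fell off the end of a non-void function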
}
XVC protocol stack
The stack is ported from the XVC 1.0 protocol and handles three commands: getinfo:, settck:, and shift:.
https://github.com/Xilinx/XilinxVirtualCable
The XVC 1.0 communication protocol consists of the following three messages:
getinfo:
settck:<period in ns>
shift:<num bits><tms vector><tdi vector>
For each message the client is expected to send the message and wait for a
response from the server. The server needs to process each message in the order
received and promptly provide a response. Note that for the XVC 1.0 protocol
only one connection is assumed so as to avoid interleaving locking and
interleaving issues that may occur with concurrent client communication.
MESSAGE: "getinfo:"
The primary use of "getinfo:" message is to get the XVC server version. The
server version provides a client a way of determining the protocol capabilities
of the server.
Syntax
Client Sends:
"getinfo:"
Server Returns:
"xvcServer_v1.0:<xvc_vector_len>\n"
Where:
<xvc_vector_len> is the max width of the vector that can be shifted into the server
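A minimal sketch of building the getinfo reply on the server side follows. The function name xvc_format_getinfo is hypothetical, and the vector length is whatever the server's buffers allow; any value passed in is for illustration only.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes "xvcServer_v1.0:<vector_len>\n" into out.
   Returns the reply length in bytes (without the terminating NUL),
   or -1 if the buffer is too small. */
int xvc_format_getinfo(char *out, size_t out_size, unsigned vector_len)
{
    int n = snprintf(out, out_size, "xvcServer_v1.0:%u\n", vector_len);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

The returned length is what would be handed to the socket send call; the NUL terminator is not part of the reply.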
MESSAGE: "settck:"
The "settck:" message configures the server TCK period. When sending JTAG
vectors the TCK rate may need to be varied to accommodate cable and board
signal integrity conditions. This command is used by clients to adjust the TCK
rate in order to slow down or speed up the shifting of JTAG vectors.
Syntax:
Client Sends:
"settck:<set period>"
Server Returns:
"<current period>"
Where:
<set period> is the TCK period specified in ns. This value is a little-endian integer value.
<current period> is the value set on the server by the settck command. If the server cannot set the value then it will return the current value.
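The settck exchange can be sketched as below. This assumes the period is a 32-bit little-endian integer, as in the Xilinx reference server; the function names and the min_period_ns limit are hypothetical and stand in for whatever TCK range the hardware supports.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian integer from a byte buffer. */
uint32_t xvc_get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Encode a 32-bit integer as little-endian bytes. */
void xvc_put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Handle the 4 bytes that follow "settck:".
   If the requested period is supported, it becomes the current period;
   otherwise the current period is left unchanged. The reply always
   carries the current period. Returns the reply length in bytes. */
size_t xvc_handle_settck(const uint8_t payload[4], uint8_t reply[4],
                         uint32_t *tck_period_ns, uint32_t min_period_ns)
{
    uint32_t requested = xvc_get_le32(payload);
    if (requested >= min_period_ns)
        *tck_period_ns = requested;   /* hardware can run this slowly or slower */
    xvc_put_le32(reply, *tck_period_ns);
    return 4;
}
```

Decoding byte by byte rather than casting the buffer to a uint32_t pointer avoids alignment and endianness surprises on the MCU.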
MESSAGE: "shift:"
The "shift:" message is used to shift JTAG vectors in and out of a device.
The number of bits to shift is specified as the first shift command parameter
followed by the TMS and TDI data vectors. The TMS and TDI vectors are
sized according to the number of bits to shift, rounded up to the nearest byte.
For instance if shifting in 13 bits the byte vectors will be rounded to 2
bytes. Upon completion of the JTAG shift operation the server will return a
byte sized vector containing the sampled target TDO value for each shifted
TCK clock.
Syntax:
Client Sends:
"shift:<num bits><tms vector><tdi vector>"
Server Returns:
"<tdo vector>"
Where:
<num bits> : is an integer in little-endian mode. This represents the number of TCK clk toggles needed to shift the vectors out.
<tms vector> : is a byte sized vector with all the TMS shift in bits. Bit 0 in Byte 0 of this vector is shifted out first. The vector is num_bits and rounds up to the nearest byte.
<tdi vector> : is a byte sized vector with all the TDI shift in bits. Bit 0 in Byte 0 of this vector is shifted out first. The vector is num_bits and rounds up to the nearest byte.
<tdo vector> : is a byte sized vector with all the TDO shift out bits. Bit 0 in Byte 0 of this vector is shifted out first. The vector is num_bits and rounds up to the nearest byte.
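The framing of a shift message can be sketched as a small parser. It assumes num bits is a 32-bit little-endian integer, as in the Xilinx reference server; xvc_shift_t and xvc_parse_shift are hypothetical names. Because TCP may deliver a message in pieces, the sketch reports "need more data" separately from "not a shift command".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t       num_bits;   /* number of TCK cycles to shift */
    size_t         num_bytes;  /* length of each vector: num_bits rounded up to bytes */
    const uint8_t *tms;        /* points into the message buffer */
    const uint8_t *tdi;        /* points into the message buffer */
} xvc_shift_t;

/* Returns 0 if a complete shift message was parsed, 1 if more bytes are
   needed, and -1 if the buffer does not start with "shift:". */
int xvc_parse_shift(const uint8_t *msg, size_t len, xvc_shift_t *out)
{
    static const char prefix[] = "shift:";
    const size_t plen = sizeof prefix - 1;

    if (len < plen + 4)
        return 1;                               /* header not complete yet */
    if (memcmp(msg, prefix, plen) != 0)
        return -1;

    const uint8_t *p = msg + plen;
    uint32_t bits = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    /* Round up to whole bytes without overflowing for large bit counts. */
    size_t bytes = (size_t)(bits / 8u) + ((bits % 8u) != 0u ? 1u : 0u);

    if (len - (plen + 4) < 2 * bytes)
        return 1;                               /* vectors not complete yet */

    out->num_bits  = bits;
    out->num_bytes = bytes;
    out->tms       = p + 4;
    out->tdi       = out->tms + bytes;
    return 0;
}
```

For the 13-bit example in the text, each vector is 2 bytes, so the whole message is 6 + 4 + 2 + 2 = 14 bytes and the TDO reply is 2 bytes.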
Protocol stack verification
Open Hardware Manager -> localhost (right-click) -> Add Xilinx Virtual Cable (XVC)
Host name: 192.168.31.6
Port: 2542
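Protocol stack acceleration
The directly ported XVC stack was extremely slow. Analysis showed that XVC data suffered long latency on the W5300 -> FPGA -> M3 path. Adding DMA logic that moves the XVC socket data straight from the W5300 into M3 memory raised the measured rate from ~100 kbps to ~5 Mbps.
Changing the TCK clock rate of the JTAG interface raised the rate further, from ~5 Mbps to ~7 Mbps.
In practice the 100 Mbps W5300 is still somewhat slow: TCP/IP transfers take up a large share of the time and the M3 is relatively weak, so the W5300 + FPGA scheme remains slower than a ZYNQ-based XVC scheme. Its main advantages are cost, boot time, and power consumption. ZYNQ core boards are expensive and designing a custom ZYNQ board is too complex, so the ZYNQ scheme was dropped.
Next optimization step: change the W5300's 8-bit parallel read/write interface to a 16-bit parallel interface. In theory this raises the rate to ~13 Mbps, which is enough for everyday use.
Possible future schemes:
1. An HPM MCU has been verified: measured TCP/IP throughput reaches ~500-800 Mbps and the MCU can run at 600 MHz, which could greatly increase the XVC rate.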
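Custom protocol
- Read the FPGA's internal registers over the custom protocol, for debugging (done)
- Read and write FLASH contents over the custom protocol, to fetch and configure the IP and MAC addresses (done)
- Configure the JTAG TCK rate over the custom protocol, to adapt to different hardware environments (done)
- Bridge SPI and I2C peripherals over the custom protocol, for everyday debugging (done)
- Control GPIO over the custom protocol, for everyday debugging (done)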
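PCB revision items
- Connect the W5300 LINK pin to an FPGA pin so that initialization can rerun after the network cable is unplugged
- Change the W5300 data width to 16 bits
- Add ESD protection chips to the network and JTAG interfaces, and a level-shifter chip to JTAG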