Dec 26, 2007

Got fever

Got a fever and had an injection at the hospital. Still not feeling well.

Dec 20, 2007

Arp Virus

We could not browse the web normally: a window kept popping up showing "The Earn the QQ...". The web pages had a link to another website embedded in them. I checked and suspected an ARP virus. Running arp -a showed that the gateway's MAC address kept changing. I downloaded a tool to scan all the company's computers and found that the MAC address of one salesperson's computer matched the result. I asked the salesperson to disconnect from the LAN, and after that we could browse the web normally.
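The check above boils down to one question: does any other host in the ARP table claim the same MAC address as the gateway? Here is a minimal sketch of that comparison in C. The struct, the function name, and the sample addresses used with it are made up for illustration; it also assumes the MAC strings are already in one consistent format, such as the lowercase dash-separated form that arp -a prints on Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of an ARP table: an IPv4 address and the MAC it resolves to.
 * (Hypothetical type, not part of any real tool.) */
struct arp_entry {
    const char *ip;
    const char *mac;
};

/* Return the index of the first host, other than the gateway itself,
 * whose MAC equals the gateway's MAC. Return -1 if the gateway is not
 * in the table or no other host shares its MAC. */
int find_gateway_impostor(const struct arp_entry *tab, int n,
                          const char *gateway_ip)
{
    const char *gw_mac = NULL;
    int i;

    assert(tab != NULL && gateway_ip != NULL);

    /* Look up the MAC currently recorded for the gateway. */
    for (i = 0; i < n; i++) {
        if (strcmp(tab[i].ip, gateway_ip) == 0) {
            gw_mac = tab[i].mac;
            break;
        }
    }
    if (gw_mac == NULL)
        return -1;

    /* Any other IP with the same MAC is a likely ARP spoofer. */
    for (i = 0; i < n; i++) {
        if (strcmp(tab[i].ip, gateway_ip) != 0 &&
            strcmp(tab[i].mac, gw_mac) == 0)
            return i;
    }
    return -1;
}
```

With a table in which the gateway and one workstation both map to the same MAC, the function returns the index of that workstation, which is the machine to unplug first.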

Dec 19, 2007


Today I debugged why the armv5te code does not work. I finally found that Multi compiles the following code; it seems Multi does not recognize #if 0.
;#if 0
; mov v1, #(1<<(COL_SHIFT-1))
; smlabt v2, ip, a4, v1 ;/* v2 = W4*col[1] + (1<<(COL_SHIFT-1)) */
; smlabb v1, ip, a4, v1 ;/* v1 = W4*col[0] + (1<<(COL_SHIFT-1)) */
; ldr a4, [a1, #(16*4)]
It works after commenting out the code. I tested the performance against simple_idct_arm.s: the put and add functions are 90% faster. However, when I put the armv5te code into the whole project, performance improved by only 7%. Why does this happen? The test code runs in OCRAM while the project code runs in SDRAM, so there are probably many memory access stalls. I might need to consider moving decode_slice MB into OCRAM.

Dec 13, 2007

Current YUV2RGB conversion

Since I think the bottleneck is the decode part, I will not put more attention on the YUV2RGB conversion. The FFMPEG yuv2rgb is too slow, so I wrote my own, which is faster than FFMPEG's.

void yuv_convert_rgb(AVPicture *dst, const AVPicture *src,
                     int width, int height)
{
    const uint8_t *y1_ptr, *y2_ptr, *cb_ptr, *cr_ptr;
    uint8_t *d, *d1, *d2;
    int w, y, width2;
    int v1, uv, u2;
    int dst_linesize, src_linesize, src_uvlinesize;

    /* a1..a4 are 256-entry chroma lookup tables defined elsewhere. */
    d = dst->data[0];
    y1_ptr = src->data[0];
    cb_ptr = src->data[1];
    cr_ptr = src->data[2];
    width2 = (width + 1) >> 1;
    dst_linesize = dst->linesize[0];
    src_linesize = src->linesize[0];
    src_uvlinesize = src_linesize >> 1;
    /* Process two rows at a time; each chroma sample covers a 2x2 block. */
    for (; height >= 2; height -= 2) {
        d1 = d;
        d2 = d + dst_linesize;
        y2_ptr = y1_ptr + src_linesize;
        for (w = width; w >= 2; w -= 2) {
            v1 = a1[cr_ptr[0]];
            u2 = a4[cb_ptr[0]];
            uv = a2[cr_ptr[0]] + a3[cb_ptr[0]];

            /* Two RGB565 pixels per 32-bit store, first pixel in the low 16 bits. */
            ((uint32_t *)(d1))[0] = ((((y1_ptr[0] + v1) >> 3) << 11) | (((y1_ptr[0] - uv) >> 2) << 5) | ((y1_ptr[0] + u2) >> 3))
                | (((((y1_ptr[1] + v1) >> 3) << 11) | (((y1_ptr[1] - uv) >> 2) << 5) | ((y1_ptr[1] + u2) >> 3)) << 16);

            ((uint32_t *)(d2))[0] = ((((y2_ptr[0] + v1) >> 3) << 11) | (((y2_ptr[0] - uv) >> 2) << 5) | ((y2_ptr[0] + u2) >> 3))
                | (((((y2_ptr[1] + v1) >> 3) << 11) | (((y2_ptr[1] - uv) >> 2) << 5) | ((y2_ptr[1] + u2) >> 3)) << 16);

            d1 += 2 * 2;
            d2 += 2 * 2;

            y1_ptr += 2;
            y2_ptr += 2;
            /* One chroma sample per 2x2 block; the row-end adjustment below relies on this advance. */
            cb_ptr++;
            cr_ptr++;
        }
        /* Skip to the next pair of rows. */
        d += (dst_linesize << 1);
        y1_ptr += (src_linesize << 1) - width;
        cb_ptr += src_uvlinesize - width2;
        cr_ptr += src_uvlinesize - width2;
    }
}
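The tables a1 to a4 are not shown in the post. One plausible way to fill them is sketched below, assuming full-range BT.601 coefficients (R = Y + 1.402 Cr', G = Y - 0.714 Cr' - 0.344 Cb', B = Y + 1.772 Cb', with Cr' and Cb' centered on 128) in 16.16 fixed point. The function names and the tables_ready flag are made up for this sketch. It also clamps each channel to 0..255 before packing, which the loop above does not do.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the a1..a4 tables used in the post (assumed layout). */
static int a1[256]; /* Cr contribution added to R        */
static int a2[256]; /* Cr contribution subtracted from G */
static int a3[256]; /* Cb contribution subtracted from G */
static int a4[256]; /* Cb contribution added to B        */
static int tables_ready = 0;

/* Fill the tables once; coefficients are scaled by 65536. */
void yuv_tables_init(void)
{
    int i;
    for (i = 0; i < 256; i++) {
        int c = i - 128;               /* centre chroma on zero */
        a1[i] = (91881 * c) / 65536;   /* ~1.402 */
        a2[i] = (46802 * c) / 65536;   /* ~0.714 */
        a3[i] = (22554 * c) / 65536;   /* ~0.344 */
        a4[i] = (116130 * c) / 65536;  /* ~1.772 */
    }
    tables_ready = 1;
}

/* Clamp an intermediate channel value to the 8-bit range. */
static int clamp_u8(int x)
{
    return x < 0 ? 0 : (x > 255 ? 255 : x);
}

/* Convert one YCbCr sample to a packed RGB565 pixel. */
uint16_t yuv_to_rgb565(uint8_t y, uint8_t cb, uint8_t cr)
{
    int r, g, b;

    assert(tables_ready); /* yuv_tables_init() must run first */

    r = clamp_u8(y + a1[cr]);
    g = clamp_u8(y - (a2[cr] + a3[cb]));
    b = clamp_u8(y + a4[cb]);
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

After calling yuv_tables_init(), a neutral sample such as yuv_to_rgb565(128, 128, 128) gives mid gray, 0x8410. The clamp costs a few compares per pixel, so a speed-critical loop may prefer a saturating lookup table instead.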

The time cost.

Currently, decoding a frame costs 37 ms, yuv2rgb costs 13 ms, and the LCD display costs 14 ms. However, the LCD display goes through DMA, so its time can overlap with the decode time. We need to consider how to cut down the decode time. I might need to add profiling to check which part costs too much time.
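To see what the DMA overlap is worth, here is a small sketch of the frame budget. It assumes double buffering, so the DMA can display frame N while the CPU decodes and converts frame N+1; the function names are made up. With the numbers above, running everything back to back takes 64 ms per frame (about 15.6 fps), while overlapping the display leaves 50 ms per frame (20 fps), so decode plus yuv2rgb becomes the whole budget.

```c
#include <assert.h>

/* Per-frame time when decode, colour conversion and display
 * run one after another on the same timeline. */
int frame_ms_serial(int decode_ms, int convert_ms, int display_ms)
{
    assert(decode_ms >= 0 && convert_ms >= 0 && display_ms >= 0);
    return decode_ms + convert_ms + display_ms;
}

/* Per-frame time when the DMA display of one frame overlaps the
 * CPU work (decode + convert) for the next frame. The slower of
 * the two stages sets the pace. */
int frame_ms_overlapped(int decode_ms, int convert_ms, int display_ms)
{
    int cpu_ms;

    assert(decode_ms >= 0 && convert_ms >= 0 && display_ms >= 0);
    cpu_ms = decode_ms + convert_ms;
    return cpu_ms > display_ms ? cpu_ms : display_ms;
}
```

As long as the CPU stage is slower than the 14 ms display, shaving time off decode or yuv2rgb translates directly into a higher frame rate.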

Dec 12, 2007

Fast YUV2RGB conversion (repost)

swdata->pixels = (Uint8 *) malloc(width*height*2);
swdata->colortab = (int *)malloc(4*256*sizeof(int));
Cr_r_tab = &swdata->colortab[0*256];
Cr_g_tab = &swdata->colortab[1*256];
Cb_g_tab = &swdata->colortab[2*256];
Cb_b_tab = &swdata->colortab[3*256];
swdata->rgb_2_pix = (Uint32 *)malloc(3*768*sizeof(Uint32));
r_2_pix_alloc = &swdata->rgb_2_pix[0*768];
g_2_pix_alloc = &swdata->rgb_2_pix[1*768];
b_2_pix_alloc = &swdata->rgb_2_pix[2*768];
for (i=0; i<256; i++) {
    /* This table replaces the multiplications with lookups. */
    CB = CR = (i-128);
    Cr_r_tab[i] = (int) ( (0.419/0.299) * CR);
    Cr_g_tab[i] = (int) (-(0.299/0.419) * CR);
    Cb_g_tab[i] = (int) (-(0.114/0.331) * CB);
    Cb_b_tab[i] = (int) ( (0.587/0.331) * CB);
}
Rmask = display->format->Rmask;
Gmask = display->format->Gmask;
Bmask = display->format->Bmask;
for ( i=0; i<256; ++i ) {
    /* This table handles saturation, and the shifts are already applied,
       so at lookup time the r, g and b values only need to be ORed together.
       r is saturated to 0~0xf800 (top 5 bits), g to 0~0x07e0 (middle 6 bits),
       b to 0~0x001f (low 5 bits). */
    r_2_pix_alloc[i+256] = i >> (8 - number_of_bits_set(Rmask));
    r_2_pix_alloc[i+256] <<= free_bits_at_bottom(Rmask);
    g_2_pix_alloc[i+256] = i >> (8 - number_of_bits_set(Gmask));
    g_2_pix_alloc[i+256] <<= free_bits_at_bottom(Gmask);
    b_2_pix_alloc[i+256] = i >> (8 - number_of_bits_set(Bmask));
    b_2_pix_alloc[i+256] <<= free_bits_at_bottom(Bmask);
}

/* Per-pixel lookup during conversion: */
cr_r = 0*768+256 + colortab[ *cr + 0*256 ];
crb_g = 1*768+256 + colortab[ *cr + 1*256 ] + colortab[ *cb + 2*256 ];
cb_b = 2*768+256 + colortab[ *cb + 3*256 ];
++cr; ++cb;
L = *lum++; /* OR the three values together to form RGB565. */
*row1++ = (rgb_2_pix[ L + cr_r ]|rgb_2_pix[ L + crb_g ]|rgb_2_pix[ L + cb_b ]);
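The excerpt calls number_of_bits_set and free_bits_at_bottom without showing their bodies, and it only fills entries 256..511 of each 768-entry table. The sketch below is my guess at the missing pieces, judging from the names and from the comment about saturation: the two helpers count mask bits, and the outer thirds of the table repeat the end values, so that an index of 256 + L + offset saturates without any compare. build_channel_table is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Count the 1 bits in a channel mask, e.g. 0xF800 has 5. */
int number_of_bits_set(uint32_t mask)
{
    int n = 0;
    while (mask) {
        n += (int)(mask & 1u);
        mask >>= 1;
    }
    return n;
}

/* Count the 0 bits below the lowest 1 bit, e.g. 0xF800 has 11. */
int free_bits_at_bottom(uint32_t mask)
{
    int n = 0;
    assert(mask != 0); /* a zero mask would loop forever */
    while (!(mask & 1u)) {
        n++;
        mask >>= 1;
    }
    return n;
}

/* Build one 768-entry channel table.
 * Entries 256..511 map an 8-bit value to its pre-shifted pixel bits.
 * Entries 0..255 and 512..767 repeat the minimum and maximum
 * (assumed behaviour), so out-of-range sums clamp for free. */
void build_channel_table(uint32_t *tab, uint32_t mask)
{
    int bits = number_of_bits_set(mask);
    int shift = free_bits_at_bottom(mask);
    int i;

    assert(tab != 0 && bits <= 8);

    for (i = 0; i < 256; i++)
        tab[i + 256] = ((uint32_t)i >> (8 - bits)) << shift;

    for (i = 0; i < 256; i++) {
        tab[i] = tab[256];        /* underflow clamps to the minimum */
        tab[i + 512] = tab[511];  /* overflow clamps to the maximum  */
    }
}
```

For RGB565 the red mask is 0xF800, so build_channel_table(tab, 0xF800) puts 0x8000 at tab[256 + 128] and 0xF800 at every index from 511 upward. The trade-off is 3 * 768 * 4 bytes of table in exchange for a per-pixel path with no branches and no shifts.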

