Over the past week I have focused on energy-efficient network
processor design, in particular on precise DVS for low-power
network processors. Existing online DVS for network processors
simply changes Vdd by a predefined amount every time scaling is
enforced, for example reducing Vdd by 0.05V whenever Vdd needs
to be lowered. With our performance model and SA method, however,
we can determine the optimal Vdd to set for each PE. Such precise
DVS should therefore perform better than the existing heuristic
DVS. This is a natural extension of our previous work. The to-do
list includes:

1. Based on call-level statistics, generate the incoming
packet rate for one second. This has been done separately.
2. Based on the packet rate, decide the power-optimal Vdd
assignment and frequency assignment. This has also been done
separately.
3. Run the simulation with this frequency assignment to verify
the QoS.
4. Accumulate the power consumed in this second, and repeat
steps 1 to 3.
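
The four steps above can be sketched as a simple per-second loop.
This is only a minimal sketch: the function run_precise_dvs, its
callback parameters, and the toy_* stand-ins are all hypothetical
names introduced for illustration, and the toy models are made-up
placeholders, not our performance model or SA optimizer.

```python
from typing import Callable, List, Tuple

# Per-PE (Vdd in volts, frequency in MHz); the units are an assumption.
Assignment = List[Tuple[float, float]]


def run_precise_dvs(
    num_seconds: int,
    packet_rate: Callable[[int], float],
    optimize: Callable[[float], Assignment],
    meets_qos: Callable[[float, Assignment], bool],
    power: Callable[[Assignment], float],
) -> Tuple[float, List[int]]:
    """Return (accumulated energy in joules, seconds that failed the QoS check)."""
    total_energy = 0.0
    qos_violations: List[int] = []
    for t in range(num_seconds):
        rate = packet_rate(t)                  # step 1: packet rate for this second
        assignment = optimize(rate)            # step 2: Vdd/frequency assignment
        if not meets_qos(rate, assignment):    # step 3: verify QoS by simulation
            qos_violations.append(t)
        total_energy += power(assignment) * 1.0  # step 4: watts times one second
    return total_energy, qos_violations


# Toy stand-ins (hypothetical) so the loop can be exercised end to end.
def toy_rate(t: int) -> float:
    return 1000.0 + 500.0 * (t % 2)


def toy_optimize(rate: float) -> Assignment:
    vdd = 1.0 if rate <= 1000.0 else 1.2
    return [(vdd, 400.0 * vdd)] * 4            # four identical PEs


def toy_qos(rate: float, assignment: Assignment) -> bool:
    return sum(f for _, f in assignment) >= rate


def toy_power(assignment: Assignment) -> float:
    return sum(v * v * f * 1e-3 for v, f in assignment)


energy, violations = run_precise_dvs(2, toy_rate, toy_optimize, toy_qos, toy_power)
```

Passing the models in as callbacks keeps the loop unchanged when
the heuristic DVS or the DVS without bus impact is substituted
for the optimizer, so the same driver can produce all the power
results to be compared.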

We can then compare the power results against the heuristic DVS
and against DVS without bus impact.

The problem with our approach is that obtaining an optimal Vdd
setting by SA is too time-consuming if the method is to be used
for online DVS. Offline DVS is another story.
I am reading a few papers to gain knowledge of offline DVS.
My opinion is that we can first target offline DVS, assuming
the application can be profiled ahead of run time and
all optimal Vdd settings are stored in a table.
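
One possible shape for such a table is sketched below. It assumes
the table is indexed by packet-rate bins, which is my assumption
and not yet decided; the class VddTable, its lookup method, and
the example entries are hypothetical and the numbers are
placeholders, not profiled results.

```python
import bisect
from typing import List, Sequence, Tuple

VddSetting = Tuple[float, ...]  # one Vdd value (volts) per PE


class VddTable:
    """Precomputed Vdd settings, indexed by the upper bound of a packet-rate bin."""

    def __init__(self, entries: Sequence[Tuple[float, VddSetting]]) -> None:
        # Each entry is (maximum packet rate covered, per-PE Vdd setting).
        self._entries: List[Tuple[float, VddSetting]] = sorted(
            entries, key=lambda e: e[0]
        )
        self._rates = [rate for rate, _ in self._entries]

    def lookup(self, rate: float) -> VddSetting:
        # Pick the first bin whose upper bound is at least the observed rate.
        i = bisect.bisect_left(self._rates, rate)
        if i == len(self._rates):
            # Rate above the profiled range: fall back to the highest entry.
            i -= 1
        return self._entries[i][1]


# Placeholder entries for illustration only.
table = VddTable([
    (1000.0, (0.9, 0.9)),
    (2000.0, (1.1, 1.1)),
    (3000.0, (1.3, 1.3)),
])
setting = table.lookup(1500.0)  # falls in the (1000, 2000] bin
```

With a table like this the online cost is a single binary search
per scaling decision, and the SA runs are needed only during
offline profiling.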

Another topic we may explore is dynamic scheduling for
energy minimization with simultaneous QoS and thermal guarantees.
Here the thermal guarantee means a constraint on the maximum
on-chip temperature. The basic idea is that, for given QoS and
thermal constraints, any combination of (Vdd, PE number)
corresponds to a point in one of the four regions partitioned by
the iso-QoS and iso-temperature curves:

Region I: the (Vdd, PE number) combination leads to high
temperature and excessive performance. In this region the scheduling
is to reduce Vdd (assuming all PEs have the same Vdd), but not to
turn off any existing PE.

Region II: high temperature but not enough performance. In this
region the scheduling is to increase the PE number.

Region III: low temperature but not enough performance. In this
region the scheduling is to increase Vdd but not to turn on any
new PE.

Region IV: low temperature but excessive performance. In this
region the scheduling is to reduce the PE number.
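
The four-region rule can be written down as a small decision
function. This is a sketch only: the names Action and
schedule_step are hypothetical, and the handling of points lying
exactly on a curve (temperature equal to the limit counts as low,
performance equal to the requirement counts as sufficient) is my
assumption.

```python
from enum import Enum
from typing import Tuple


class Action(Enum):
    REDUCE_VDD = "reduce Vdd, keep existing PEs on"
    INCREASE_PE = "increase PE number"
    INCREASE_VDD = "increase Vdd, do not turn on new PEs"
    REDUCE_PE = "reduce PE number"


def schedule_step(
    temperature: float,
    max_temperature: float,
    performance: float,
    required_performance: float,
) -> Tuple[str, Action]:
    """Map the current operating point to (region label, scheduling action)."""
    too_hot = temperature > max_temperature          # above the iso-temperature curve
    enough_perf = performance >= required_performance  # on or above the iso-QoS curve
    if too_hot and enough_perf:
        return "I", Action.REDUCE_VDD
    if too_hot and not enough_perf:
        return "II", Action.INCREASE_PE
    if not too_hot and not enough_perf:
        return "III", Action.INCREASE_VDD
    return "IV", Action.REDUCE_PE


# Example: a point that is too hot but has performance to spare.
region, action = schedule_step(95.0, 85.0, 1.2, 1.0)
```

Here temperature and performance would come from the thermal and
performance models evaluated at the current (Vdd, PE number)
point; the units in the example call are arbitrary.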

The results can be compared to dynamic scheduling with only a
QoS constraint or only a thermal constraint.

By combining the two topics, we can prepare a journal submission.