Creation of a HAL for the Cell Processor: Appendices
Filip Novotny
20 May 2009
Supervised by Paul Amblard

Table of contents
1 Appendix 1: Notions of PowerPC64 assembly
2 Appendix 2: HAL specifications
    2.1 Platform Abstraction Layer
    2.2 Endianness management
    2.3 Multiprocessor management
    2.4 CPU Abstraction Layer
    2.5 Endianness management
    2.6 Execution context management
    2.7 I/O management
    2.8 Interrupts management
    2.9 Exception management
    2.10 Multiprocessor management
    2.11 Synchronization management
3 Appendix 3: Multi-SPE ping-pong code
1 Appendix 1: Notions of PowerPC64 assembly

This appendix briefly introduces the PowerPC assembly instructions mentioned in this TER. On PowerPC, a word is 32 bits and a doubleword is 64 bits.

Load Immediate: li reg, val
    Loads the immediate value val into register reg.
    Example:
        li r0, 1

Store Byte/Word/Doubleword: st{b,w,d} reg, address
    Writes the low-order byte/word/doubleword of reg to address. The address must have the form displacement(register).
    Example:
        std r0, 0(r1)   # writes a doubleword to the address contained in r1
        stw r0, 0(r1)   # writes a word to the address contained in r1
        stb r0, 0(r1)   # writes a byte to the address contained in r1

Load Byte/Word/Doubleword: lbz, lwz, ld reg, address
    Loads the byte/word/doubleword located at address into register reg. The byte and word forms (lbz, lwz) zero-extend the loaded value. The address must have the form displacement(register).
    Example:
        ld  r0, 0(r1)   # loads a doubleword from the address contained in r1
        lwz r0, 0(r1)   # loads a word from the address contained in r1
        lbz r0, 0(r1)   # loads a byte from the address contained in r1

Move Register: mr reg1, reg2
    Copies reg2 into reg1.
    Example:
        mr r1, r2       # copies r2 into r1

Branch and Link: bl address
    Calls the function at address. The address is given in hexadecimal. This instruction saves the return address (i.e. the address of the instruction following the bl) in the lr register.
    Example:
        bl aabbccdd     # calls the function whose entry point is at 0xAABBCCDD

Branch to Link Register: blr
    Continues execution at the address contained in lr. Used to return from a function.
2 Appendix 2: HAL specifications

2.1 Platform Abstraction Layer

This API is composed of two parts: endianness management and multiprocessor management. Its implementations are located in the pal directory of the SSP root path.

2.2 Endianness management

This section presents the elements for dealing with endianness at the platform level. To guarantee the coherency of data shared between two heterogeneous processors of the same hardware platform, an endianness property is associated with the platform. It defines the official byte order of the platform's shared data.

definition PLATFORM_IS_LITTLE_ENDIAN
    This definition specifies that all shared data is little-endian.

definition PLATFORM_IS_BIG_ENDIAN
    This definition specifies that all shared data is big-endian.

Note that only one of the above definitions can be used at a time.

2.3 Multiprocessor management

This section presents the elements for dealing with multiprocessor systems.

function PLATFORM_MP_CPU_COUNT return int32
    type: in enumerate - The processor's type
    This function returns the number of processors of the specified type.

2.4 CPU Abstraction Layer

This API is composed of six parts: endianness, execution context management, I/O management, trap management (interrupts and exceptions), multiprocessor management and software spinlock management. Its implementations are located in the cal directory of SSP's root directory.

2.5 Endianness management

This section presents the elements for dealing with endianness at the processor level.

procedure CPU_ENDIAN_IS_BIG_{16,32,64}
    value: inout Integer{16,32,64}
    Depending on the processor's endianness, this procedure reorders the bytes of the {16,32,64}-bit word value into big-endian order.

procedure CPU_ENDIAN_IS_LITTLE_{16,32,64}
    value: inout Integer{16,32,64}
    Depending on the processor's endianness, this procedure reorders the bytes of the {16,32,64}-bit word value into little-endian order.
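As an illustration of the two reordering procedures above, here is a minimal C sketch of their 32-bit variants. The lowercase names cpu_endian_is_big_32, cpu_endian_is_little_32, swap32 and host_is_little_endian are hypothetical and are not taken from the SSP sources. The run-time endianness probe is also an assumption: an actual CAL implementation would more likely fix the behaviour at compile time for its target processor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 when the host stores the least
 * significant byte of a word at the lowest address. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Unconditionally reverses the four bytes of a 32-bit word. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Sketch of CPU_ENDIAN_IS_BIG_32: after the call, the in-memory bytes
 * of *value are in big-endian order. No-op on a big-endian host. */
void cpu_endian_is_big_32(uint32_t *value)
{
    assert(value != NULL);
    if (host_is_little_endian())
        *value = swap32(*value);
}

/* Sketch of CPU_ENDIAN_IS_LITTLE_32: after the call, the in-memory bytes
 * of *value are in little-endian order. No-op on a little-endian host. */
void cpu_endian_is_little_32(uint32_t *value)
{
    assert(value != NULL);
    if (!host_is_little_endian())
        *value = swap32(*value);
}
```

With this sketch, a processor would call cpu_endian_is_big_32 on a word before publishing it on a platform declared PLATFORM_IS_BIG_ENDIAN, whatever its own byte order.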
procedure CPU_ENDIAN_CONCAT
    result: out Integer{16,32,64}
    low: in Integer{8,16,32}
    high: in Integer{8,16,32}
    This procedure concatenates two {8,16,32}-bit elements into one {16,32,64}-bit element that matches the processor's endianness.

procedure CPU_ENDIAN_SPLIT
    value: in Integer{16,32,64}
    low: out Integer{8,16,32}
    high: out Integer{8,16,32}
    This procedure splits one {16,32,64}-bit element matching the processor's endianness into two {8,16,32}-bit elements.

2.6 Execution context management

This section presents the elements for dealing with execution contexts.

type CPU_CONTEXT_T
    This type represents the execution context of the processor.

definition CPU_CONTEXT_SIZE
    This definition gives the size of the processor's context.

procedure CPU_CONTEXT_INIT
    context: out access CPU_CONTEXT_T
    stack: in access array of Integer8 - The stack's size
    entry: in access procedure
    arguments: in access record
    This procedure initializes an execution context with a stack of a specific size, an entry point and the arguments that will be passed to the entry point. Note that the stack must already have been allocated.

procedure CPU_CONTEXT_LOAD
    context: in access CPU_CONTEXT_T
    This procedure loads a specific context.

procedure CPU_CONTEXT_SAVE
    context: out access CPU_CONTEXT_T
    This procedure saves the current execution context into context.

procedure CPU_CONTEXT_SWITCH
    from: out access CPU_CONTEXT_T
    to: in access CPU_CONTEXT_T
    This procedure saves the current execution context into the context pointed to by from, then loads the context pointed to by to.

2.7 I/O management

This section presents the elements for dealing with I/Os.

procedure CPU_READ
    address: in access Integer{8,16,32,64}
    result: out Integer{8,16,32,64}
    This procedure reads a {8,16,32,64}-bit integer from address into result.

procedure CPU_READ_UNCACHED
    address: in access Integer{8,16,32,64}
    result: out Integer{8,16,32,64}
    This procedure reads a {8,16,32,64}-bit integer from address into result, bypassing the cache.

procedure CPU_READ_VECTOR
    mode: in VectorMode - Integer 8, 16, 32, 64; Float; Double
    from: in access <VectorMode>
    to: out access <VectorMode>
    This procedure reads a vector from the location from and writes it to the location to. The operation is executed in the specified mode.

procedure CPU_WRITE
    address: in access Integer{8,16,32,64}
    value: in Integer{8,16,32,64}
    This procedure writes the {8,16,32,64}-bit integer value to address.

procedure CPU_WRITE_UNCACHED
    address: in access Integer{8,16,32,64}
    value: in Integer{8,16,32,64}
    This procedure writes the {8,16,32,64}-bit integer value to address, bypassing the cache.

procedure CPU_WRITE_VECTOR
    mode: in VectorMode - Integer 8, 16, 32, 64; Float; Double
    to: in access <VectorMode>
    from: in access <VectorMode>
    This procedure writes a vector to the location to, reading it from the location from. The operation is executed in the specified mode.

2.8 Interrupts management

This section presents the elements for dealing with interrupts.
type interrupt_id_t
    This type represents an interrupt line of a processor or of a processor's subsystem.

type interrupt_status_t
    This type represents the interrupt status of a processor or of a processor's subsystem.

type interrupt_handler_t
    This type represents the handler of a processor interrupt.

procedure CPU_IT_ENABLE
    vector: in interrupt_id_t
    This procedure unmasks the interrupt indexed by vector.

procedure CPU_IT_DISABLE
    vector: in interrupt_id_t
    This procedure masks the interrupt indexed by vector.

procedure CPU_IT_ATTACH_ISR
    vector: in interrupt_id_t
    handler: in interrupt_handler_t
    This procedure attaches an interrupt handler to a specific interrupt vector.

2.9 Exception management

This section presents the elements for dealing with processor exceptions.

type exception_id_t
    This type represents an exception of a processor.

type exception_handler_t
    This type represents the handler of a processor exception.

procedure CPU_IT_ATTACH_ESR
    vector: in exception_id_t
    handler: in exception_handler_t
    This procedure attaches an exception handler to a specific exception vector.

2.10 Multiprocessor management

This section presents the elements for dealing with multiprocessor configurations.

function CPU_MP_COUNT return Integer32
    This function returns the number of CPUs identical to the one calling the function.
function CPU_MP_ID return Integer32
    This function returns the processor's unique identifier.

procedure CPU_MP_WAIT
    sync: in access Integer32
    This procedure spins until the value of the variable sync becomes 0. This variable must exist and must be initialized to 1 beforehand.

procedure CPU_MP_PROCEED
    sync: in access Integer32
    This procedure sets the variable sync to 0.

2.11 Synchronization management

This section presents the elements for dealing with synchronization.

function CPU_TEST_AND_SET return Integer32
    lock: in access Integer32
    This function performs a test-and-set operation on the given lock.

function CPU_COMPARE_AND_SWAP return Integer32
    lock: inout Integer32
    old: in Integer32
    new: in Integer32
    This function performs a compare-and-swap on the specified lock, using old as the expected value and new as the replacement value.
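The multiprocessor and synchronization primitives above can be sketched in portable C11 to show how a software spinlock might be built on them. This is only an illustration under stated assumptions: the specification does not define the return values, so the sketch assumes that test-and-set returns the previous value of the lock and that compare-and-swap returns 1 on success and 0 on failure. The lowercase function names and the spin_lock/spin_unlock helpers are hypothetical, and <stdatomic.h> stands in for the processor-specific atomic instructions a real CAL would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed semantics: atomically set *lock to 1, return the old value. */
int32_t cpu_test_and_set(_Atomic int32_t *lock)
{
    assert(lock != NULL);
    return atomic_exchange(lock, 1);
}

/* Assumed semantics: if *lock == old_value, store new_value and
 * return 1; otherwise leave *lock untouched and return 0. */
int32_t cpu_compare_and_swap(_Atomic int32_t *lock,
                             int32_t old_value, int32_t new_value)
{
    /* The C11 call overwrites its "expected" argument on failure,
     * so it operates on a local copy. */
    int32_t expected = old_value;
    assert(lock != NULL);
    return atomic_compare_exchange_strong(lock, &expected, new_value) ? 1 : 0;
}

/* Hypothetical spinlock: busy-wait until the previous value was 0. */
void spin_lock(_Atomic int32_t *lock)
{
    while (cpu_test_and_set(lock) != 0) {
        /* another processor holds the lock: keep spinning */
    }
}

void spin_unlock(_Atomic int32_t *lock)
{
    atomic_store(lock, 0);
}

/* CPU_MP_WAIT: spin until the flag, initialized to 1, becomes 0. */
void cpu_mp_wait(_Atomic int32_t *sync)
{
    while (atomic_load(sync) != 0) {
        /* wait for another processor to call cpu_mp_proceed */
    }
}

/* CPU_MP_PROCEED: release every processor spinning on the flag. */
void cpu_mp_proceed(_Atomic int32_t *sync)
{
    atomic_store(sync, 0);
}
```

A typical use would have a boot processor finish its initialization and call cpu_mp_proceed on a shared flag while the other processors sit in cpu_mp_wait, after which all of them protect shared structures with spin_lock and spin_unlock.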
3 Appendix 3: Multi-SPE ping-pong code

In this code, the printf function was written using CPU_WRITE.

PPE side:

    while (1) {
        printf("ping");
        /* Transfer at least 16 bytes, since this is the minimum size
           supported by DMA on the SPE side */
        for (i = 0; i < 8; i++)
            CPU_WRITE(8*16, 0x5000 + i*0x100, "ping");
        /* Poll each SPE's slot for a reply */
        for (i = 0; i < 8; i++) {
            val = 0;
            CPU_READ(8*16, 0x5000 + i*0x100, &val);
            if (!memcmp(&val, "pong", 4))
                printf("pong from spu %d\n", i);
        }
    }

Then, SPE side:

    #define ME 0 /* SPE number 0 */

    while (1) {
        /* Read the shared address via DMA */
        mfc_get(buffer, 0x5000 + ME*0x100, 16, tag, tid, rid);
        mfc_read_tag_status_all(); /* wait for the DMA to complete */
        if (!memcmp(buffer, "ping", 4)) {
            /* Write the reply back to the same shared slot */
            memcpy(buffer, "pong", 4);
            mfc_put(buffer, 0x5000 + ME*0x100, 16, tag, tid, rid);
            mfc_read_tag_status_all();
        }
    }
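The handshake in the listing can be tried on an ordinary host with the small simulation below. It is a sketch, not the code that ran on the Cell: a plain array (fake_memory) stands in for main memory, memcpy stands in for CPU_WRITE, CPU_READ, mfc_get and mfc_put, and the function names ppe_send_ping, spe_step and ppe_collect_pongs are hypothetical. Only the slot layout (base 0x5000, stride 0x100, 16-byte messages, 8 SPEs) is taken from the listing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLOT_BASE   0x5000u /* first slot address, as in the listing */
#define SLOT_STRIDE 0x100u  /* distance between two SPE slots */
#define SLOT_SIZE   16u     /* message size in bytes */
#define SPE_COUNT   8u

/* Assumption: a host array replaces the shared main memory. */
static uint8_t fake_memory[SLOT_BASE + SPE_COUNT * SLOT_STRIDE];

/* Returns the address of the slot reserved for the given SPE. */
uint8_t *slot(unsigned spe)
{
    assert(spe < SPE_COUNT);
    return &fake_memory[SLOT_BASE + spe * SLOT_STRIDE];
}

/* PPE side: post a 16-byte "ping" message in every slot. */
void ppe_send_ping(void)
{
    for (unsigned i = 0; i < SPE_COUNT; i++) {
        memset(slot(i), 0, SLOT_SIZE);
        memcpy(slot(i), "ping", 4);
    }
}

/* SPE side: one loop iteration. Returns 1 if a ping was answered. */
int spe_step(unsigned me)
{
    uint8_t buffer[SLOT_SIZE];
    memcpy(buffer, slot(me), SLOT_SIZE);     /* stands in for mfc_get */
    if (memcmp(buffer, "ping", 4) != 0)
        return 0;
    memcpy(buffer, "pong", 4);
    memcpy(slot(me), buffer, SLOT_SIZE);     /* stands in for mfc_put */
    return 1;
}

/* PPE side: count the slots that currently hold a "pong" reply. */
unsigned ppe_collect_pongs(void)
{
    unsigned count = 0;
    for (unsigned i = 0; i < SPE_COUNT; i++) {
        if (memcmp(slot(i), "pong", 4) == 0)
            count++;
    }
    return count;
}
```

Because each SPE owns a separate slot, the PPE can poll all eight slots in turn without any lock; the message content itself ("ping" or "pong") tells each side whose turn it is.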