Register-allocated paging for big data calculations
Software to support the Monte Carlo method generates large vectors of pseudo-random numbers and uses these as operands in complex mathematical expressions. When such software is run on standard PC-based hardware, the volume of data involved often exceeds the physical RAM available. To address this problem, vectors must be paged out to disk and paged back in when required. This paging is often the performance bottleneck limiting the execution speed of the software. Because the mathematical expressions are specified in advance of execution, predictive solutions are possible – for instance, by treating the problem similarly to register allocation. The problem of allocating scalar variables to processor registers is a widely studied aspect of compiler implementation. A register allocation algorithm decides which variable is held in which register, when the value in a register can be overwritten, and when a value is stored in, or later retrieved from, main memory. In this paper, register allocation techniques are used to plan the paging of vectors in Monte Carlo software. Two register allocation algorithms are applied to invented vector programs written in a prototype low-level vector language and the results are compared.
Keywords: Big Data, Compiler, Monte Carlo, Paging, Performance, Register Allocation.
Duffy, D.J. and Kienitz, J. (2009) ‘Monte Carlo frameworks: building customisable high performance C++ applications’, Chichester, John Wiley & Sons.
Tian, X. and Benkrid, K. (2008) ‘Design and Implementation of a High Performance Financial Monte-Carlo Simulation Engine on an FPGA Supercomputer’, ICECE Technology International Conference on Field Programmable Technology, pp 81–88.
Nandivada, V. K. (2007) ‘Advances in Register Allocation Techniques’, the Compiler Design Handbook: Optimizations and Machine Code Generation, CRC Press.
Muchnick, S. S. (1997) ‘Advanced Compiler Design and Implementation’, San Francisco, Morgan Kaufmann Publishers, Inc.
Appel A. W. and Palsberg, J. (2002) ‘Modern Compiler Implementation in Java’, 2nd Edition, Cambridge University Press, Cambridge.
Pereira, F. M. Q. and Palsberg, J. (2005) ‘Register Allocation via Coloring of Chordal Graphs’, Lecture Notes in Computer Science, vol. 3780/2005, pp 315–329.
Chaitin, G. J. (1982) ‘Register allocation and spilling via graph coloring’, Symposium on Compiler Construction, vol. 17, no. 6, pp 98–105.
Blazy, S. and Robillard, B. (2009) ‘Live-Range Unsplitting for Faster Optimal Coalescing’, Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems, ACM, New York.
Dowd, K. and Severance, C. (1998) ‘High Performance Computing’, Sebastopol, O’Reilly & Associates, Inc.
Li, L., Gao, L. and Xue, J. (2005) ‘Memory Coloring: A Compiler Approach for Scratchpad Memory Management’, Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT’05), pp 329–338.
Yang, X., Deng, Y., Wang, L., Yan, X., Du, J., Zhang, Y., Wang, G. and Tang, T. (2009) ‘SRF Coloring: Stream Register File Allocation via Graph Coloring’, Journal of Computer Science and Technology, vol. 24, no. 1, pp 152–164.
The Open University (2001) MT365 Graphs, Networks and Design, Milton Keynes, Open University.
Briggs, P. (1992) ‘Register Allocation via Graph Coloring’, (PhD Thesis) Rice University.
Cooper, K. D., Harvey, T. J. and Torczon, L. (1998) ‘How to Build an Interference Graph’, Software – Practice and Experience, vol. 28, no. 4 (April), pp 425–444.
Galinier, P. and Hao, J.-K. (1999) ‘Hybrid Evolutionary Algorithms for Graph Coloring’, Journal of Combinatorial Optimization, vol. 3, pp 379–397, The Netherlands, Kluwer Academic Publishers.
Glass, C. A. and Prügel-Bennett, A. (2003) ‘Genetic Algorithm for Graph Coloring: Exploration of Galinier and Hao’s Algorithm’, Journal of Combinatorial Optimization, vol. 7, pp 229–236, The Netherlands, Kluwer Academic Publishers.
Mahajan, A. and Ali, M.S. (2008) ‘Hybrid Evolutionary Algorithm for graph coloring register allocation’, IEEE Congress on Evolutionary Computation, Hong Kong, pp1162–1167.
Cooper, K. D., Schielke, P. J. and Subramanian, D. (1999) ‘Optimizing for Reduced Code Space using Genetic Algorithms’, Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers and tools for embedded systems, New York, ACM.
George, L. and Appel, A. (1996) ‘Iterated register coalescing’, ACM Transactions on Programming Languages and Systems, vol. 18, no. 3, May 1996, pp 300–324, New York, ACM.
Park, J. and Moon, S.-M. (2007) ‘Optimistic Register Coalescing’, ACM Transactions on Programming Languages and Systems, vol. 26, no. 4, July 2004, pp 735–765, New York, ACM.
Odaira, R., Nakaike, T., Inagaki, T., Komatsu, H. and Nakatani, T. (2010) ‘Coloring-based Coalescing for Graph Coloring Register Allocation’, CGO ’10 Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization.
Koes, D. R. and Goldstein, S.C. (2009) ‘Register Allocation Deconstructed’, 12th International Workshop on Software & Compilers for Embedded Systems, ACM, New York.
Bouchez, F., Colombet, Q., Darte, A., Rastello, F. and Guillon, C. (2010) ‘Parallel Copy Motion’, SCOPES’10, June 2010, ACM.
Smith, M. D., Ramsey, N. and Holloway, G. (2004) ‘A Generalized Algorithm for Graph-Coloring Register Allocation’, Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation, vol. 39, issue 6 (May), New York, ACM.
Ahn, M. and Paek, Y. (2009) ‘Register Coalescing Techniques for Heterogeneous Register Architecture with Copy Sifting’, ACM Transactions on Embedded Computing Systems, vol. 8, no. 2, January 2009, New York, ACM.
Appel, A. W. and George, L. (2001) ‘Optimal spilling for CISC machines with few registers', PLDI'01, pp243–253.
Hack, S., Grund, D. and Goos, G (2006) ‘Register allocation for programs in SSA form’, Lecture Notes in Computer Science, 2006, vol. 3923/2006, pp 247–262.
Barik, R., Grothoff, C., Gupta, R., Pandit, V. and Udupa, R. (2007) ‘Optimal Bitwise Register Allocation Using Integer Linear Programming’, Lecture Notes in Computer Science vol. 4382, pp 267–282, Berlin Heidelberg, Springer-Verlag.
Poletto, M. and Sarkar, V. (1999) ‘Linear scan register allocation’, ACM Transactions on Programming Languages and Systems, vol. 21, no. 6.
Mössenböck, H. and Pfeiffer, M. (2002) ‘Linear Scan Register Allocation in the Context of SSA Form and Register Constraints’, Lecture Notes in Computer Science, vol. 2304/2002, pp 229–246, Berlin Heidelberg, Springer-Verlag.
Wimmer, C. and Franz, M. (2010) ‘Linear scan register allocation on SSA forms’, CGO ’10 Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization.
Sarkar, V. and Barik, R. (2007) ‘Extended Linear Scan: An Alternate Foundation for Global Register Allocation’, Compiler Construction, Lecture Notes in Computer Science, vol. 4420, pp 141–155, Berlin Heidelberg, Springer-Verlag.
Pereira, F. M. Q. and Palsberg, J. (2008) ‘Register allocation by puzzle solving’, PLDI ’08 Proceedings of the 2008 ACM SIGPLAN Conference on Programming Language Design and Implementation, New York, ACM.
Pereira, F. M. Q. and Palsberg, J. (2010) ‘Punctual Coalescing’, Lecture Notes in Computer Science, vol. 6011/2010, pp 165–184, Berlin Heidelberg, Springer-Verlag.
George, L. and Appel, A. (1995) ‘Iterated register coalescing’, Technical Report CS-TR-498-95, Princeton University [http://www.cs.princeton.edu/research/techreps/TR-498-95 accessed 31-Dec-2011].