Mode of study:
distance learning
Cost of self-study:
free
Access:
open
Certificate of completion:

Level:
Specialist
Duration:
14:25:00
Students:
355
Graduates:
24
Course rating:
4.00 | 4.20
The course concentrates mostly on improving application performance with the Intel Compiler and VTune Amplifier.
It briefly describes microprocessor architecture, application performance factors, and common speedup techniques: scalar optimizations, loop optimizations, vectorization, parallelization, interprocedural optimizations and profile-guided optimizations. The course describes the compiler architecture and command-line options, the compiler's limitations, and ways of providing additional information to the compiler. It also gives a first look at performance analysis. Practical examples help students become familiar with VTune and with the ideas of performance optimization.
Specialties: Programmer
 

Lesson plan

Lesson
Title
Lecture 1
54 minutes
Introduction to application optimization using Intel® performance tools
The first lecture describes the general architecture of Intel microprocessors and the main factors affecting their performance. A simplified microprocessor model is used to show the role of each subsystem and to describe the main features: the multi-level memory hierarchy, general-purpose and vector registers, the data prefetching mechanism, branch prediction, pipelining and superscalar execution, vector instructions, and multi-core and multi-processor configurations. The role of the compiler in performance optimization is also described.
Lecture 2
17 minutes
Intel® performance analysis tools
The second lecture briefly introduces VTune Amplifier, an important performance tool, and covers the main ideas behind its use: the general performance tuning workflow, the VTune graphical interface, and the main analysis techniques and how VTune implements them.
Lecture 3
42 minutes
Optimizing compiler. Scalar optimizations
This lecture gives an approximate compilation scheme for the Intel compilers. Topics include the role of the front end and the compiler's internal representation, the control flow graph and its importance for analysis, the basic scalar optimizations, and the notion of static single assignment (SSA) form.
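As a rough illustration of what a scalar optimization and SSA-style naming look like, here is a minimal hand-written sketch in C. The function names `original` and `optimized` are hypothetical, and the second function only imitates by hand what a compiler might do internally (common subexpression elimination, one assignment per name); it is not taken from the lecture or from any Intel compiler output.

```c
/* As a programmer might write it: a * b is evaluated twice
   and the variable x is assigned twice. */
int original(int a, int b)
{
    int x = a * b + 1;
    x = x + a * b;
    return x;
}

/* Hand-applied equivalent in SSA style: every name is assigned once. */
int optimized(int a, int b)
{
    int t1 = a * b;    /* common subexpression computed once */
    int x1 = t1 + 1;   /* first definition of x */
    int x2 = x1 + t1;  /* second definition of x gets a new name */
    return x2;
}
```

Both functions return the same value for any inputs that do not overflow `int`; the second simply makes the data flow explicit.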
Test 3
42 minutes
Lecture 4
43 minutes
Optimizing compiler. Loop optimizations
This lecture is devoted to loops and loop optimizations. Topics discussed: the problem of loop recognition and classification; why loop optimizations deserve careful study; common loop optimizations; optimization examples; the notions of dependence and permutation; the legality criterion for an optimization in terms of dependences; compiler command-line options.
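The link between dependences and the legality of a loop permutation can be sketched with loop interchange. The example below is an illustration only: the size `N` and all function names are hypothetical. In the first pair of functions the iterations are independent, so swapping the loops preserves the result. In the second pair, `a[i][j]` reads `a[i - 1][j + 1]`, a dependence with distance (1, -1); interchanging the loops would turn it into (-1, 1), so the read would run before the write it depends on.

```c
#include <string.h>

#define N 4  /* hypothetical size, for illustration only */

/* Case 1: no loop-carried dependence, so either loop order is legal. */
void scale_row_order(int a[N][N])
{
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            a[i][j] = 2 * a[i][j];
}

void scale_column_order(int a[N][N])
{
    for (int j = 0; j < N; ++j)
        for (int i = 0; i < N; ++i)
            a[i][j] = 2 * a[i][j];
}

/* Case 2: each element reads a value written in the previous i-iteration
   at column j + 1, i.e. dependence distance (1, -1). */
void shift_row_order(int a[N][N])
{
    for (int i = 1; i < N; ++i)
        for (int j = 0; j < N - 1; ++j)
            a[i][j] = a[i - 1][j + 1] + 1;
}

/* Same body with the loops interchanged: a[i - 1][j + 1] is now read
   before it has been updated, so the result differs from the original. */
void shift_column_order(int a[N][N])
{
    for (int j = 0; j < N - 1; ++j)
        for (int i = 1; i < N; ++i)
            a[i][j] = a[i - 1][j + 1] + 1;
}

/* Helper: returns 1 if both matrices hold the same values. */
int same(int a[N][N], int b[N][N])
{
    return memcmp(a, b, sizeof(int[N][N])) == 0;
}
```

Starting from identical inputs, the two `scale_*` functions produce the same matrix, while the two `shift_*` functions do not, which is why a compiler must reject the interchange in the second case.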
Lecture 5
1 hour 16 minutes
Optimizing compiler. Vectorization
The lecture reviews the basic principles of vectorization. Topics discussed: the problems of automatic vectorization; the influence of data alignment and memory access patterns; the compiler options associated with the auto-vectorizer; preprocessor directives and language constructs related to vectorization; the vectorization profitability criterion.
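A small sketch of the kind of loop an auto-vectorizer can usually handle, next to one it cannot handle as written. The function names are hypothetical, and whether a given compiler actually vectorizes the first loop depends on its options and target; the code only illustrates the properties discussed above (unit-stride access, no aliasing, no loop-carried dependence).

```c
#include <stddef.h>

/* restrict is a standard C99 language construct: it promises that x and y
   do not overlap, removing a possible aliasing dependence that could
   otherwise prevent vectorization. */
void saxpy(size_t n, float alpha, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = alpha * x[i] + y[i];  /* unit stride, independent iterations */
}

/* Loop-carried dependence: iteration i needs the result of iteration i - 1,
   so this loop cannot be vectorized in this form. */
void prefix_sum(size_t n, float *a)
{
    for (size_t i = 1; i < n; ++i)
        a[i] += a[i - 1];
}
```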
Lecture 6
59 minutes
Optimizing compiler. Auto-parallelization
The lecture describes the main features of multiprocessor and multicore computing systems and the pros and cons of multithreaded applications. Topics include auto-parallelization as a simple method of creating multithreaded applications, compiler command-line options, some of the language extensions used for parallelization, and manual and automatic prefetching.
Test 6
33 minutes
Lecture 7
14 minutes
OpenMP fundamentals
This lecture describes the basics of OpenMP: the OpenMP workflow and its limitations; the different kinds of variables and code sections; critical and concurrent sections; loop parallelization directives; loop iteration scheduling.
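A loop parallelization directive with a scheduling clause might look like the following minimal sketch. The function name `sum_to` is hypothetical; the `parallel for`, `reduction` and `schedule` constructs are standard OpenMP.

```c
/* Sums the integers 1..n. The reduction clause gives each thread a private
   partial sum and combines the partial sums when the loop ends;
   schedule(static) divides the iterations into equal contiguous chunks. */
long sum_to(long n)
{
    long sum = 0;
    #pragma omp parallel for reduction(+:sum) schedule(static)
    for (long i = 1; i <= n; ++i)
        sum += i;
    return sum;
}
```

If the code is built without OpenMP support enabled, the pragma is ignored and the loop runs serially with the same result.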
Lecture 8
1 hour 2 minutes
Optimizing compiler. Interprocedural optimizations
The lecture covers interprocedural analysis and optimizations: the reasons for whole-program analysis, its limitations and simplifications; mod/ref analysis; local points-to analysis; propagation of function and variable attributes; compiler command-line options that control interprocedural analysis and optimizations.
Lecture 9
51 minutes
Optimizing compiler. Static and dynamic profiler. Memory manager. Code generator
This lecture describes the roles of the profiler, memory manager and code generator. Topics include the principles of using the dynamic profiler, the difference between static and dynamic profiling, memory distribution and allocation issues, the effect of the memory manager on application performance, register allocation, and instruction scheduling.
Test 9
24 minutes
1 hour 40 minutes