Introduction to performance optimization using Intel SW tools
Introduction to performance optimization using Intel SW tools - Intuit test answers
The course concentrates mostly on application performance improvements with Intel Compiler and VTune Amplifier.
List of questions:
- # The Control Unit functions are
- # What are the goals of the ALU?
- # Registers are
- # The system bus is used for
- # What is CPU speed?
- # x86 speed factors are
- # CPU timer speed is
- # Choose the correct statement
- # Memory that is directly accessed by the processor is
- # Which of the following will not cause any change in processor performance?
- # Modern Intel processors are
- # Why is register access latency lower than RAM latency?
- # Superscalar is
- # Choose the wrong statement
- # Superscalarity is
- # What is used to send data between the processor and the memory or between the processor and the devices?
- # Time latency (for RAM) is
- # Superscalarity is
- # The ability to perform multiple operations in a single tick is
- # Bandwidth is
- # Superscalar is
- # Hardware prefetching is used for
- # Pipeline is
- # In a fully-associative memory
- # In out-of-order execution, instructions are scheduled according to
- # Vectorization is a parallelization technique where
- # Cache levels differ by
- # The number of ticks required to transfer one unit from memory is
- # The number of units that can be sent to the processor at once is
- # The type of cache in which any memory block can be loaded into any part of the cache
- # What is VTune™ Performance Analyzer for?
- # What kind of information is obtainable via VTune?
- # What are the requirements of VTune?
- # VTune supports:
- # Which operating systems does VTune support?
- # What capabilities does VTune have?
- # What analysis types are included in VTune?
- # What are the functions of Hotspots?
- # What is profiling?
- # What are locks and waits for?
- # What event corresponds to processor clock ticks?
- # What event corresponds to wrong branch prediction?
- # What may be the cause of ineffective resource utilization?
- # What is critical code?
- # What conditions can prevent vectorization?
- # What compilers does Intel® provide?
- # What platforms are supported by Intel compilers?
- # What are the functions of the compiler Front End?
- # Internal representation is
- # Expression is
- # Choose scalar optimizations
- # Data flow analysis is
- # Set Uses[b] contains:
- # To know what variables could be used inside the block, it is necessary to estimate:
- # SSA-form is
- # Statement M dominates N if
- # Dominance frontier is
- # May one compiler have two different Front Ends?
- # To convert a compiler to a different internal representation, it is necessary to correct
- # Which part of the compiler depends on the language most?
- # What is the role of syntax analysis in the compiler?
- # What is the role of syntax analysis in the compiler?
- # What is the input data for syntax analysis?
- # What are the criteria for connecting statements into a list inside the Intel compiler?
- # Statements could be arranged
- # How are statements connected inside the Intel compiler?
- # Basic blocks are
- # Basic blocks are contained by
- # Basic blocks are
- # Choose the correct statements
- # Choose the correct statements
- # Nodes of the control flow graph are
- # Control flow graph
- # Basic block is
- # Def-use graph nodes are
- # Tree of expressions is
- # Leaves in an expression tree
- # Constants in an expression tree
- # Operations in an expression tree
- # The advantages of SSA form:
- # SSA-form is:
- # SSA is
- # Choose the scalar optimization:
- # "Dead code" may be caused by
- # Which of the following is required to keep the equation equivalence
- # Dependency is
- # The dependency between S1 and S2 persists if
- # A transforming optimization keeps the equation equivalence if
- # What is the Loop Stream Detector for?
- # What is required for most loop optimizations?
- # Choose code fragments which are good for optimizing
- # Loop invariant code motion
- # Why is performance improved when an invariant is moved out of the loop?
- # What is a loop invariant?
- # Which optimization is the inverse of loop fusion?
- # Why can performance increase after loop distribution?
- # What could be the reason for losing performance while processing a big loop?
- # What is loop peeling?
- # Choose the code that results from loop peeling for: p = 10; for (i=0; i<10; ++i) { y[i] = x[i] + x[p]; p = i; }
- # What is loop unrolling for?
- # How is loop unrolling performed?
- # When is full loop unrolling applicable?
- # Choose the correct statements for this code: S1 PI = 3.14 S2 R = 5 S3 AREA = PI*R**2
- # Choose the correct statements for the code: DO I=1,N S1 A(I) = B(I) + 1 S2 B(I+1) = A(I) - 5 END DO
- # The required conditions for a dependency between S1 and S2 are the following:
- # What are normalized loops?
- # When is the dependence <S1,S2> an anti-dependence?
- # When is the dependence <S1,S2> an output dependence?
- # When is the dependence <S1,S2> a true dependence?
- # What is an iteration vector?
- # What is required for a loop dependency between S1 and S2 in a nested set?
- # What are normalized loops used for?
- # What is constant folding?
- # Loop optimizations are:
- # Is there any dependence in this code? DO I=1,N S1 A(I+1) =F(I) S2 F(I+1) = A(I) END DO
- # Is there any dependence in this code? DO I=1,N S1 A(I)=… S2 …=A(I) END DO
- # What is FLOW dependency?
- # What is OUTPUT dependency?
- # What is ANTI dependency?
- # Loop vectorization is
- # MMX technology provides:
- # SSE is:
- # SIMD is:
- # Which of the following command line options will build a binary for any processor?
- # What is the condition for vectorization?
- # What is a vector instruction for the compiler?
- # Which of the following is required to execute a vector operation?
- # May four different variables become components of the same vector after vectorization?
- # What size do xmm registers have?
- # What size do ymm registers have?
- # How many xmm registers does EM64T support?
- # What is a packed data type?
- # What happens to zero bits in a packed data type?
- # Packed data type operations are
- # What is /Qvec-report used for?
- # What is __alignof__ used for?
- # Why is it recommended to arrange fields in a structure in order of decreasing size?
- # Vectorization is
- # What is the processor core?
- # What seriously limits modern system performance?
- # What types can multiprocessor systems be divided into?
- # Choose the characteristic corresponding to distributed memory systems:
- # Choose the characteristic corresponding to shared memory systems:
- # Choose the characteristic corresponding to non-uniform memory access systems:
- # What qualities do distributed memory systems have?
- # What qualities do shared memory systems have?
- # What qualities do non-uniform memory access systems have?
- # What disadvantages do distributed memory systems have?
- # What disadvantages do shared memory systems have?
- # What disadvantages do non-uniform memory access systems have?
- # What are the pros of multi-threaded applications?
- # What are the cons of multi-threaded applications?
- # What is the purpose of automatic parallelization?
- # What information does /Qpar-report3 output?
- # What kind of optimization is the auto-parallelization?
- # What are the necessary conditions for auto-parallelization?
- # Is it hard to measure optimization profitability?
- # Which directive tells the compiler not to parallelize the following loop?
- # Which directive forces the compiler to parallelize the following loop?
- # Which directive forces the compiler to parallelize the following loop if it is safe?
- # What parallel library does the Intel compiler use?
- # What is OpenMP?
- # What is passed as an argument to the loop parallelizing function in the Intel compiler?
- # What is the loop parallelizing function in the Intel compiler?
- # How is parallelization implemented in the Intel compiler?
- # How is auto-parallelization connected with other optimizations in the Intel compiler?
- # What is "prefetch"?
- # How can prefetch be invoked?
- # What cons does prefetch have?
- # OpenMP is:
- # For parallelization it is required to:
- # When using OpenMP variables behave as follows:
- # OpenMP uses the following model of parallel execution:
- # Which pragma is used to parallelize a loop?
- # Which identifier is not reserved for OpenMP?
- # What can be done to save the last state of a variable to the master thread after the parallel block?
- # By default, all variables except local function variables and loop iterators are added to
- # Schedule clause accepts the following arguments:
- # The nowait directive is used for:
- # What directive is used to create a synchronization point?
- # How many threads could enter the critical section at a time?
- # What directive is used to avoid incorrect concurrent usage of the lval variable?
- # What directive is used to mark a piece of code to be executed by the master thread only?
- # What directive marks a sequential execution block?
- # What option is used to determine multi-thread iteration distribution?
- # Which of the following is a schedule type?
- # Which of the following is a schedule type?
- # Which of the following could be considered a good style of programming?
- # Which of the following could be considered a bad style of programming?
- # What effect does global variable usage have?
- # What is variable scope?
- # What benefits does correct code formatting give?
- # What is the aim of dividing a program into functions and procedures?
- # What are the disadvantages of procedural-level optimizations?
- # What are the disadvantages of procedural-level optimizations?
- # What are the disadvantages of procedural-level optimizations?
- # What is a node in a call graph?
- # What information corresponds to vertices in a call graph?
- # Why may a call graph be considered incomplete?
- # A static call graph is
- # A dynamic call graph is
- # A dynamic call graph
- # This command line parameter is used to enable inter-file optimization
- # This command line parameter is used to disable interprocedural optimizations
- # What kind of interprocedural optimization is used by default?
- # What is alias analysis?
- # Aliasing can occur between
- # Points-to analysis is
- # What is inlining?
- # What are the goals of inlining?
- # What are the disadvantages of inlining?
- # What is memory disambiguation?
- # What is taken into account during the memory disambiguation?
- # When are permutation transformations not allowed?
- # What is -ansi-alias for?
- # What does ANSI aliasing require?
- # What is the meaning of the restrict attribute in a pointer definition in C/C++?
- # What is __declspec(align(n)) pragma used for?
- # How does the compiler determine when it is better to perform inlining?
- # How can a developer control the inlining process?
- # During VTune analysis some of the functions are missing. Why could this happen?
- # What is used to suggest a function for inlining?
- # What can be used to force a function to be inlined?
- # What option is used to disable inlining?
- # What is function cloning?
- # What is partial inlining?
- # What interprocedural optimization is specific to C++?
- # What disadvantage does a static profiler have?
- # What is the source for branch prediction in a static profiler?
- # Static profiler used
- # Dynamic profiler benefits are
- # What is required for dynamic profiling?
- # A dynamic profiler differs from a static one
- # Dynamic data is useful when
- # Dynamic memory allocation is bad for
- # Dynamic memory allocation
- # Choose the correct statement(s)
- # Choose the correct statement(s)
- # Choose the correct statement(s)
- # Register allocation includes
- # Interference graph is built
- # What entity corresponds to the colors of the interference graph?
- # How are data dependencies used in code generation?
- # What is instruction scheduling useful for?
- # How is instruction planning performed?
- # How can structure field reordering affect application performance?
- # What are the aims of structure splitting?
- # Why is pointer chasing useful?
- # When a linked list is stored in memory
- # How can the memory placement of a dynamic linked list be improved?
- # A linked list is worse than an array for
- # An array is better than a linked list for
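
The loop-peeling question in the list above starts from a loop in which `p` holds 10 only during the first iteration and equals `i - 1` afterwards. The following is a minimal sketch of what peeling that first iteration could look like, not the course's reference answer. The function names `loop_original`, `loop_peeled` and `peeling_matches`, the parameter `n` (10 in the question) and the test data are assumptions made for illustration.

```c
#include <assert.h>

/* Original loop from the question, with the bound 10 generalized to n
   (an assumption). x must hold n + 1 elements, because the first
   iteration reads x[n]. */
void loop_original(const int *x, int *y, int n) {
    int p = n;
    for (int i = 0; i < n; ++i) {
        y[i] = x[i] + x[p];
        p = i;  /* from the second iteration on, p == i - 1 */
    }
}

/* Peeled form: iteration 0 is executed separately, so the remaining
   loop no longer needs p and reads x[i - 1] directly. */
void loop_peeled(const int *x, int *y, int n) {
    assert(n >= 1);  /* peeling assumes at least one iteration */
    y[0] = x[0] + x[n];
    for (int i = 1; i < n; ++i) {
        y[i] = x[i] + x[i - 1];
    }
}

/* Returns 1 when both versions produce the same y for n = 10,
   using arbitrary (hypothetical) input data. */
int peeling_matches(void) {
    enum { N = 10 };
    int x[N + 1], a[N], b[N];
    for (int i = 0; i <= N; ++i) {
        x[i] = 3 * i + 1;
    }
    loop_original(x, a, N);
    loop_peeled(x, b, N);
    for (int i = 0; i < N; ++i) {
        if (a[i] != b[i]) {
            return 0;
        }
    }
    return 1;
}
```

After peeling, the loop body has no scalar carried between iterations, which is the kind of shape that later loop optimizations generally handle more easily.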