Introduction to performance optimization using Intel SW tools - answers to Intuit tests

The course concentrates mostly on application performance improvements with Intel Compiler and VTune Amplifier.

List of questions:

# The Control Unit functions are
# What are the goals of the ALU?
# Registers are
# The system bus is used for
# What is CPU speed?
# x86 speed factors are
# CPU timer speed is
# Choose the correct statement
# Memory that is directly accessed by the processor is
# Which of the following will not cause any change in processor performance?
# Modern Intel processors are
# Why is register access latency lower than RAM latency?
# Superscalar is
# Choose the wrong statement
# Superscalarity is
# What is used to send data between the processor and the memory or between the processor and the devices?
# Time latency (for RAM) is
# Superscalarity is
# The ability to perform multiple operations per tick is
# Bandwidth is
# Superscalar is
# Hardware prefetching is used for
# Pipeline is
# In a fully-associative memory
# In out-of-order execution, instructions are scheduled according to
# Vectorization is a parallelization technique in which
# Cache levels differ by
# The number of ticks required to transfer one unit from memory is
# The number of units that can be sent to the processor at once is
# The type of cache in which any memory block can be loaded into any part of the cache
# What is VTune™ Performance Analyzer for?
# What kind of information is obtainable via VTune?
# What are the requirements of VTune?
# VTune supports:
# What operating systems does VTune support?
# What abilities does VTune have?
# What analysis types are included in VTune?
# What are the functions of Hotspots?
# What is profiling?
# What are locks and waits for?
# What event corresponds to processor clock ticks?
# What event corresponds to a wrong branch prediction?
# What may be the cause of ineffective resource utilization?
# What is critical code?
# What conditions can prevent vectorization?
# What compilers does Intel® provide?
# What platforms are supported by Intel compilers?
# What are the functions of the compiler Front End?
# Internal representation is
# Expression is
# Choose scalar optimizations
# Data flow analysis is
# Set Uses[b] contains:
# To know what variables could be used inside the block, it is necessary to estimate:
# SSA-form is
# Statement M dominates N if
# Dominance frontier is
# May one compiler have two different Front Ends?
# To convert a compiler to a different internal representation, it is necessary to correct
# What part of the compiler depends on the language most?
# What is the role of syntax analysis in the compiler?
# What is the role of syntax analysis in the compiler?
# What is the input data for syntax analysis?
# What are the criteria for connecting statements into a list inside the Intel compiler?
# Statements could be arranged
# How are the statements connected inside the Intel compiler?
# Basic blocks are
# Basic blocks are contained by
# Basic blocks are
# Choose the correct statements
# Choose the correct statements
# Nodes of the control flow graph are
# Control flow graph
# Basic block is
# Def-use graph nodes are
# Tree of expressions is
# Leaves in an expression tree
# Constants in an expression tree
# Operations in an expression tree
# The advantages of SSA form:
# SSA-form is:
# SSA is
# Choose the scalar optimization:
# "Dead code" may be caused by
# Which of the following is required to keep the equation equivalence
# Dependency is
# The dependency between S1 and S2 persists if
# A transforming optimization keeps the equation equivalence if
# What is the Loop Stream Detector for?
# What is required for most of the loop optimizations
# Choose code fragments which are good for optimizing
# Loop invariant code motion
# Why is performance improved when an invariant is moved out of the loop?
# What is a loop invariant?
# What optimization is the inverse of loop fusion?
# Why could performance increase after loop distribution?
# What could be the reason for losing performance while processing a big loop?
# What is loop peeling?
# Choose the code resulting from loop peeling for: p = 10; for (i=0; i<10; ++i) { y[i] = x[i] + x[p]; p = i; }
# What is loop unrolling for?
# How is loop unrolling performed?
# When is full loop unrolling applicable?
# Choose the correct statements for this code: S1 PI = 3.14 S2 R = 5 S3 AREA = PI*R **2
# Choose the correct statements for the code: DO I=1,N S1 A(I) = B(I) + 1 S2 B(I+1) = A(I) - 5 END DO
# The required conditions for a dependency between S1 and S2 are the following:
# What are normalized loops?
# When is the dependence <S1,S2> an anti-dependence?
# When is the dependence <S1,S2> an output dependence?
# When is the dependence <S1,S2> a true dependence?
# What is an iteration vector?
# What is required for a loop dependency between S1 and S2 in a nested set?
# What are normalized loops used for?
# What is constant folding?
# Loop optimizations are:
# Is there any dependence in this code? DO I=1,N S1 A(I+1) =F(I) S2 F(I+1) = A(I) END DO
# Is there any dependence in this code? DO I=1,N S1 A(I)=… S2 …=A(I) END DO
# What is a FLOW dependency?
# What is an OUTPUT dependency?
# What is an ANTI dependency?
# Loop vectorization is
# MMX technology provides:
# SSE is:
# SIMD is:
# Which of the following command line options will build a binary for any processor?
# What is the condition for vectorization?
# What is a vector instruction for the compiler?
# Which of the following is required to execute a vector operation?
# May four different variables become components of the same vector after the vectorization?
# What size do xmm registers have?
# What size do ymm registers have?
# How many xmm registers does EM64T support?
# What is a packed data type?
# What happens to zero bits in a packed data type?
# Packed data type operations are
# What is /Qvec-report used for?
# What is __alignof__ used for?
# Why is it recommended to arrange fields in a structure in order of decreasing size?
# Vectorization is
# What is the processor core?
# What seriously limits modern system performance?
# What types can multiprocessor systems be divided into?
# Choose the characteristic corresponding to distributed memory systems:
# Choose the characteristic corresponding to shared memory systems:
# Choose the characteristic corresponding to non-uniform memory access systems:
# What qualities do distributed memory systems have?
# What qualities do shared memory systems have?
# What qualities do non-uniform memory access systems have?
# What disadvantages do distributed memory systems have?
# What disadvantages do shared memory systems have?
# What disadvantages do non-uniform memory access systems have?
# What are the pros of multi-threaded applications?
# What are the cons of multi-threaded applications?
# What is the purpose of automatic parallelization?
# What information does /Qpar-report3 output?
# What kind of optimization is the auto-parallelization?
# What are the necessary conditions for auto-parallelization?
# Is it hard to measure optimization profitability?
# What directive tells the compiler not to parallelize the following loop?
# What directive will force the compiler to parallelize the following loop?
# What directive will force the compiler to parallelize the following loop if it is safe?
# What parallel library does the Intel compiler use?
# What is OpenMP?
# What is passed as an argument to the loop parallelizing function in the Intel compiler?
# What is the loop parallelizing function in the Intel compiler?
# How is parallelization implemented in the Intel compiler?
# How is auto-parallelization connected with other optimizations in the Intel compiler?
# What is "prefetch"?
# How can prefetch be invoked?
# What cons does prefetch have?
# OpenMP is:
# For parallelization it is required to:
# When using OpenMP variables behave as follows:
# OpenMP uses the following model of parallel execution:
# What pragma is used to parallelize a loop:
# What identifier is not reserved for OpenMP:
# What can be done to save the last state of a variable into the master thread after the parallel block?
# By default, all variables except local function variables and loop iterators are added to
# Schedule clause accepts the following arguments:
# nowait directive is used for:
# What directive is used to create a synchronization point?
# How many threads could enter the critical section at a time?
# What directive is used to avoid incorrect concurrent usage of the lval variable?
# What directive is used to mark a piece of code to be executed by the master thread only?
# What directive marks a sequential execution block?
# What option is used to determine multi-thread iteration distribution?
# Which of the following is a schedule type?
# Which of the following is a schedule type?
# Which of the following could be considered good programming style?
# Which of the following could be considered bad programming style?
# What effect does the use of global variables have?
# What is variable scope?
# What benefits does correct code formatting give?
# What is the aim of dividing a program into functions and procedures?
# What are the disadvantages of procedural-level optimizations?
# What are the disadvantages of procedural-level optimizations?
# What are the disadvantages of procedural-level optimizations?
# What is a node in a call graph?
# What information corresponds to vertices in a call graph?
# Why may a call graph be considered incomplete?
# Static call graph is
# Dynamic call graph is
# Dynamic call graph
# This command line parameter is used to enable inter-file optimization
# This command line parameter is used to disable interprocedural optimizations
# What kind of interprocedural optimization is used by default?
# What is alias analysis?
# Aliasing can occur between
# Points-to analysis is
# What is inlining?
# What are the goals of inlining?
# What are the disadvantages of inlining?
# What is memory disambiguation?
# What is taken into account during the memory disambiguation?
# When are permutation transformations not allowed?
# What is -ansi-alias for?
# What is demanded by ANSI aliasing?
# What is the meaning of the restrict attribute in a pointer definition in C/C++?
# What is the __declspec(align(n)) pragma used for?
# How does the compiler determine when it is better to perform inlining?
# How can the developer control the inlining process?
# During VTune analysis, some of the functions are missing. Why could this happen?
# What is used to suggest a function for inlining?
# What can be used to force a function to be inlined?
# What option is used to disable inlining?
# What is function cloning?
# What is partial inlining?
# What interprocedural optimization is specific to C++?
# What disadvantage does a static profiler have?
# What is the source for branch prediction in a static profiler?
# Static profiler used
# Dynamic profiler benefits are
# What is required for dynamic profiling
# Dynamic profiler differs from static
# Dynamic data is useful when
# Dynamic memory allocation is bad for
# Dynamic memory allocation
# Choose the correct statement(s)
# Choose the correct statement(s)
# Choose the correct statement(s)
# Register allocation includes
# Interference graph is built
# What entity corresponds to the interference graph colors?
# How are data dependencies used in code generation?
# What is instruction scheduling useful for?
# How is instruction planning performed?
# How could structure field reordering affect application performance?
# What are the aims of structure splitting?
# Why is pointer chasing useful?
# When a linked list is stored in memory
# How can dynamic linked list memory placement be improved?
# A linked list is worse than an array for
# An array is better than a linked list for