Static for loop

I am writing templated short vector and small matrix classes that are not limited to 2, 3, or 4 elements, but can have an arbitrary number of elements.

template <typename T, size_t N>
class ShortVector
{
public:

    ...

    template <size_t I> T& get() { return m_data[I]; }
    template <size_t I> const T& get() const { return m_data[I]; }

private:

    T m_data[N];
};

      

I want the access interface to be static, so that I can specialize the class to use built-in vector registers for the supported sizes (be it AVX, C++ AMP, or OpenCL vectors). The problem is that writing ALL the desired operators for this class (unary, +, -, *, /, dot, length, ...) requires an awful lot of template recursion, and I have not even gotten to implementing matrix-vector and matrix-matrix multiplication, where I would need nested recursion.

Right now I have non-member friend operators and a private member class with various static functions, like

template <size_t I, typename T1, typename T2> struct Helpers
{
    static void add(ShortVector& dst, const ShortVector<T1, N>& lhs, const ShortVector<T2, N>& rhs)
    {
        // lhs and rhs have dependent types, so 'template' is needed before get.
        dst.get<I>() = lhs.template get<I>() + rhs.template get<I>();
        Helpers<I - 1, T1, T2>::add(dst, lhs, rhs);
    }

    ...
};
template <typename T1, typename T2> struct Helpers < 0, T1, T2 >
{
    static void add(ShortVector& dst, const ShortVector<T1, N>& lhs, const ShortVector<T2, N>& rhs)
    {
        dst.get<0>() = lhs.template get<0>() + rhs.template get<0>();
    }

    ...
};

      

Writing static functions and specializations like this for all operators is just plain wrong. Writing more complex operations this way is very error prone. I am looking for something like

static_for< /*Whatever needed to define something like a run-time for cycle*/, template <size_t I, typename... Args> class Functor>();

      

Or pretty much anything that lets me omit most of this boilerplate. I started writing such a class, but I could not come up with a reasonable specialization that compiles. I feel I still lack the skills to write such a class (or function). I have looked at other libraries such as Boost MPL, but have not fully committed to using it. I have also looked at std::index_sequence, which might prove useful.

While std::index_sequence seems to be the most portable solution, it has one major drawback that I am reluctant to overlook. Ultimately these classes need to be SYCL compatible, which means I am limited to C++11, including its template metaprogramming techniques. std::integer_sequence is a C++14 STL library addition, and while the restriction on the language standard only concerns language features, nothing prevents an STL implementer from using C++14 language features when implementing a C++14 STL feature, so using C++14 STL features may not be portable.

I am open to suggestions or even solutions.

EDIT

Here is what I have come up with so far. This is the header of template metaprogramming tricks that I have started putting together, and the for loop will be next in line. The helper needs a functor that takes the running index as its first parameter, and it accepts various predicates. It keeps instantiating the functor as long as the predicate for the next iteration holds true. It would be possible to have the running index incremented by any number, multiplied by a number, and so on.
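A minimal C++11 sketch of what such a predicate-driven helper might look like. The names static_for, LessThan, AddStep and AddElement are hypothetical and are not taken from the header mentioned above; plain arrays stand in for ShortVector to keep the example short.

```cpp
#include <cstddef>

// Hypothetical predicate: keep looping while the index is below End.
template <std::size_t End>
struct LessThan {
    template <std::size_t I>
    struct apply { static const bool value = (I < End); };
};

// Hypothetical step policy: the next index is I + Step.
template <std::size_t Step>
struct AddStep {
    template <std::size_t I>
    struct apply { static const std::size_t value = I + Step; };
};

// Primary template: the predicate holds for I, so run Body<I> and recurse.
template <std::size_t I, typename Pred, typename Next,
          template <std::size_t> class Body,
          bool Continue = Pred::template apply<I>::value>
struct static_for {
    template <typename... Args>
    static void run(Args&... args) {
        Body<I>()(args...);
        static_for<Next::template apply<I>::value, Pred, Next, Body>::run(args...);
    }
};

// Terminating specialization: the predicate is false, so do nothing.
template <std::size_t I, typename Pred, typename Next,
          template <std::size_t> class Body>
struct static_for<I, Pred, Next, Body, false> {
    template <typename... Args>
    static void run(Args&...) {}
};

// Loop body: the running index is the template parameter.
template <std::size_t I>
struct AddElement {
    template <typename T, std::size_t N>
    void operator()(T (&dst)[N], const T (&lhs)[N], const T (&rhs)[N]) const {
        dst[I] = lhs[I] + rhs[I];
    }
};

// Demo helper: adds two fixed arrays with the given stride, returns slot i.
template <std::size_t Step>
int added_slot(std::size_t i) {
    const int a[4] = {1, 2, 3, 4};
    const int b[4] = {10, 20, 30, 40};
    int dst[4] = {0, 0, 0, 0};
    static_for<0, LessThan<4>, AddStep<Step>, AddElement>::run(dst, a, b);
    return dst[i];
}
```

With AddStep<1> every slot is written; with AddStep<2> only slots 0 and 2 are touched. Swapping in a different step policy (for example one that multiplies the index) only requires another small struct with a nested apply.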

3 answers


How about this:

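#include <cstddef>          // std::size_t
#include <functional>       // std::plus
#include <initializer_list> // std::initializer_list
#include <tuple>            // std::tuple, std::get
#include <utility>          // std::index_sequence (C++14)

template <std::size_t I, typename Functor, typename = std::make_index_sequence<I>>
struct Apply;

template <std::size_t I, typename Functor, std::size_t... Indices>
struct Apply<I, Functor, std::index_sequence<Indices...>> :
    private std::tuple<Functor> // For EBO with functors
{
    Apply(Functor f) :  std::tuple<Functor>(f) {}
    Apply() = default;

    template <typename InputRange1, typename InputRange2, typename OutputRange>
    void operator()(OutputRange& dst,
                    const InputRange1& lhs, const InputRange2& rhs) const
    {
        // The ranges have dependent types, so the member template get needs
        // the 'template' disambiguator.
        (void)std::initializer_list<int>
        { (dst.template get<Indices>() =
               std::get<0>(*this)(lhs.template get<Indices>(),
                                  rhs.template get<Indices>()), 0)... };
    }
};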

      

Usage could be

Apply<4,std::plus<>>()(dest, lhs, rhs); // Size or functor type 
                                        // can be deduced if desired

      



A (slightly modified) example: Demo.

You could also remove the functor state if it gets in your way:

template <size_t I, typename Functor, typename = std::make_index_sequence<I>>
struct Apply;

template <size_t I, typename Functor, std::size_t... Indices>
struct Apply<I, Functor, std::index_sequence<Indices...>>
{
    template <typename InputRange1, typename InputRange2, typename OutputRange>
    void operator()(OutputRange& dst,
                    const InputRange1& lhs, const InputRange2& rhs) const
    {
        (void)std::initializer_list<int>
        { (dst.template get<Indices>() = Functor()(lhs.template get<Indices>(),
                                                   rhs.template get<Indices>()), 0)... };
    }
};
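The usage comment above notes that the size or the functor type can be deduced. One possible way to do that, sketched here under assumptions: the helper zip_with and the namespace detail are hypothetical names, the ShortVector below is a minimal stand-in for the class from the question, and std::index_sequence requires C++14.

```cpp
#include <cstddef>
#include <functional>
#include <initializer_list>
#include <utility>

// Minimal stand-in for the question's ShortVector (assumption: only get<I>).
template <typename T, std::size_t N>
class ShortVector {
public:
    template <std::size_t I> T& get() { return m_data[I]; }
    template <std::size_t I> const T& get() const { return m_data[I]; }
private:
    T m_data[N] = {};
};

namespace detail {
// Expands one assignment per index; the braced list evaluates left to right.
template <typename F, typename Out, typename In1, typename In2, std::size_t... Is>
void zip_impl(F f, Out& dst, const In1& lhs, const In2& rhs,
              std::index_sequence<Is...>) {
    (void)std::initializer_list<int>{
        (dst.template get<Is>() =
             f(lhs.template get<Is>(), rhs.template get<Is>()), 0)...};
}
}  // namespace detail

// Hypothetical helper: F and N are deduced from the arguments.
template <typename F, typename T, typename T1, typename T2, std::size_t N>
void zip_with(F f, ShortVector<T, N>& dst,
              const ShortVector<T1, N>& lhs, const ShortVector<T2, N>& rhs) {
    detail::zip_impl(f, dst, lhs, rhs, std::make_index_sequence<N>());
}

// Demo helper: adds (1,2,3) and (10,20,30), returns the sum of the result.
inline int demo_sum() {
    ShortVector<int, 3> a, b, c;
    a.get<0>() = 1;  a.get<1>() = 2;  a.get<2>() = 3;
    b.get<0>() = 10; b.get<1>() = 20; b.get<2>() = 30;
    zip_with(std::plus<int>(), c, a, b);
    return c.get<0>() + c.get<1>() + c.get<2>();
}
```

Because N appears in all three ShortVector parameters, passing vectors of different sizes fails to compile rather than reading out of bounds.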

      


You can take a look at the Boost Fusion algorithms.

All it takes is to adapt your type as a Fusion sequence.

Simple example: Live On Coliru



#include <boost/array.hpp>
#include <boost/fusion/adapted.hpp>
#include <boost/fusion/algorithm.hpp>
#include <boost/fusion/include/io.hpp>
#include <iostream>

int main()
{
    using namespace boost;

    boost::array<int, 4> iv4 { 1,2,3,4 };
    boost::array<double, 4> id4 { .1, .2, .3, .4 };

    auto r = fusion::transform(iv4, id4, [](auto a, auto b) { return a+b; });
    std::cout << r;
}

      

Printing

(1.1 2.2 3.3 4.4)

      


Regarding

"I am limited to using C++11, including template metaprogramming techniques. std::integer_sequence is a C++14 STL library addition […]"

… you can do this with e.g. the g++ compiler:

#include <tuple>    // std::tuple, std::tuple_cat

namespace my {
    using std::tuple;
    using std::tuple_cat;

    template< int i >
    struct Number_as_type_ {};

    template< int... values >
    using Int_sequence_ = tuple< Number_as_type_<values>... >;

    template< class Int_seq_a, class Int_seq_b >
    using Concat_ = decltype( tuple_cat( Int_seq_a(), Int_seq_b() ) );

    template< int max_index >
    struct Index_sequence_t_
    {
        using T = Concat_<
            typename Index_sequence_t_<max_index-1>::T, Int_sequence_<max_index>
            >;
    };

    template<>
    struct Index_sequence_t_<0> { using T = Int_sequence_<0>; };

    template< int n_indices >
    using Index_sequence_ = typename Index_sequence_t_<n_indices - 1>::T;
}  // namespace my

      


Unfortunately, Visual C++ 12.0 (2013) fails at template argument deduction for the above Int_sequence_. This appears to be caused by the compiler erroneously treating a templated using as an automatically referenced local typedef in a class. Anyway, working from that understanding of the Visual C++ compiler bug, I rewrote the above as follows, which appears to work nicely with Visual C++ too:

// Version that works better with Visual C++ 12.0
#include <tuple>    // std::tuple, std::tuple_cat

namespace my {
    using std::tuple;
    using std::tuple_cat;

    template< int i >
    struct Number_as_type_ {};

    template< int... values >
    struct Int_sequence_
    {
        using As_tuple = tuple< Number_as_type_<values>... >;
    };

    template< int... values >
    auto int_seq_from( tuple< Number_as_type_<values>... > )
        -> Int_sequence_< values... >;

    template< class Int_seq_a, class Int_seq_b >
    using Concat_ = decltype(
        int_seq_from( tuple_cat(
            typename Int_seq_a::As_tuple(), typename Int_seq_b::As_tuple()
            ) )
        );

    template< int n_indices >
    struct Index_sequence_t_
    {
        using T = Concat_<
            typename Index_sequence_t_<n_indices-1>::T, Int_sequence_<n_indices-1>
            >;
    };

    template<>
    struct Index_sequence_t_<1> { using T = Int_sequence_<0>; };

    template< int n_indices >
    using Index_sequence_ = typename Index_sequence_t_<n_indices>::T;
}  // namespace my
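For comparison, a commonly seen alternative way to build such a sequence in C++11 accumulates the indices in a parameter pack instead of concatenating tuples. This is a sketch, not the answer's own code: the namespace sketch and all names in it are chosen here for illustration, and plain arrays stand in for the vector class.

```cpp
#include <cstddef>
#include <initializer_list>
#include <type_traits>

namespace sketch {

template <std::size_t... Is>
struct index_sequence {};

// Counts N down to 0, prepending N-1 to the pack on each step.
template <std::size_t N, std::size_t... Is>
struct make_index_sequence_impl : make_index_sequence_impl<N - 1, N - 1, Is...> {};

template <std::size_t... Is>
struct make_index_sequence_impl<0, Is...> {
    using type = index_sequence<Is...>;
};

template <std::size_t N>
using make_index_sequence = typename make_index_sequence_impl<N>::type;

static_assert(std::is_same<make_index_sequence<3>, index_sequence<0, 1, 2>>::value,
              "make_index_sequence<3> should be index_sequence<0, 1, 2>");

// One assignment per index, expanded at compile time.
template <typename T, std::size_t N, std::size_t... Is>
void add_impl(T (&sum)[N], const T (&a)[N], const T (&b)[N], index_sequence<Is...>) {
    (void)std::initializer_list<int>{((void)(sum[Is] = a[Is] + b[Is]), 0)...};
}

template <typename T, std::size_t N>
void add(T (&sum)[N], const T (&a)[N], const T (&b)[N]) {
    add_impl(sum, a, b, make_index_sequence<N>());
}

// Demo helper: adds the two arrays used in the answer and totals the result.
inline int added_total() {
    const int a[5] = {1, 2, 3, 4, 5};
    const int b[5] = {100, 200, 300, 400, 500};
    int sum[5] = {0, 0, 0, 0, 0};
    add(sum, a, b);
    int total = 0;
    for (int x : sum) { total += x; }
    return total;
}

}  // namespace sketch
```

The recursion depth grows linearly with N, which is fine for short vectors but can hit the compiler's template depth limit for large sizes.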

      


With the above C++11-based support in hand, a general compile-time indexing for loop (or, if you prefer, template-based unrolling) can be implemented in C++11, so that code like this can be written:

template< int i >
struct Add_
{
    void operator()( int sum[], int const a[], int const b[] ) const
    {
        sum[i] = a[i] + b[i];
    }
};

#include <iostream>
using namespace std;

auto main() -> int
{
    int sum[5];
    int const a[] = {1, 2, 3, 4, 5};
    int const b[] = {100, 200, 300, 400, 500};

    my::for_each_index<5, Add_>( sum, a, b );

    for( int x: sum ) { cout << x << ' '; } cout << endl;
}

      

Do note, however, that while this may look like the best thing since sliced pizza, I suspect that any reasonably good compiler will perform loop unrolling as a matter of course, i.e. there is not necessarily any advantage to be gained from introducing this extra bit of complexity.

As always with optimization, do MEASURE.


In this design the loop is completely unrolled, i.e. instead of n executions of the loop body with a varying index, you get n instances of the loop body with different index values. This is not necessarily the best approach, e.g. because larger code is less likely to fit in cache (to repeat: for optimization, always measure), and for parallelism you may have special requirements. You can check out "Duff's device" for a technique for more limited loop unrolling.

#include <initializer_list>  // std::initializer_list
#include <utility>           // std::forward

namespace my {
    using std::forward;
    using std::initializer_list;

    template< class Type >
    void evaluate( initializer_list< Type > const& ) {}

    namespace impl {
        template< template <int> class Functor_, class... Args >
        struct Call_with_numbers_
        {
            template<  int... numbers >
            void operator()( Int_sequence_<numbers...> const&, Args&&... args ) const
            {
                evaluate( {(Functor_<numbers>()( args... ), 0)...} );
            }
        };
    }  // namespace impl

    template< int n, template<int> class Functor_, class... Args >
    void for_each_index( Args&&... args )
    {
        using Seq = Index_sequence_<n>;
        Seq s;
        impl::Call_with_numbers_< Functor_, Args... >()( s, forward<Args>( args )... );
    }
}  // namespace my
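The evaluate( {...} ) call above relies on the elements of a braced-init-list being evaluated in the order they are written, so the functor instances run in index order. A small sketch that records the visiting order; Seq, Record, expand and visit_order are hypothetical names used only for this illustration, and the sequence is spelled out by hand to keep it short.

```cpp
#include <initializer_list>
#include <vector>

// Hand-written stand-in for an index sequence.
template <int... Is>
struct Seq {};

// Functor that appends its compile-time index to a log.
template <int I>
struct Record {
    void operator()(std::vector<int>& log) const { log.push_back(I); }
};

// Calls F<Is>()(args...) once per index, left to right.
template <template <int> class F, int... Is, typename... Args>
void expand(Seq<Is...>, Args&... args) {
    (void)std::initializer_list<int>{(F<Is>()(args...), 0)...};
}

// Demo helper: returns the order in which the indices were visited.
inline std::vector<int> visit_order() {
    std::vector<int> log;
    expand<Record>(Seq<0, 1, 2, 3>(), log);
    return log;
}
```

A plain function call with the pack expanded as arguments would not give this guarantee, which is why the expansion is wrapped in braces.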

      

Disclaimer: coded late at night, so not necessarily very perfect! :-/
