
Taxonomies in software engineering: A systematic mapping study and a revised taxonomy development method


Title: Taxonomies in Software Engineering: A Systematic Mapping Study and a Revised Taxonomy Development Method
Authors: Muhammad Usman, Ricardo Britto, Jürgen Börstler, Emilia Mendes
Institution: Blekinge Institute of Technology
Field: Software Engineering
Type: research paper
Year of publication: 2017
City: Karlskrona
Number of pages: 17
File size: 2.01 MB


ARTICLE IN PRESS

Information and Software Technology 0 0 0 (2017) 1–17

Contents lists available at ScienceDirect

journal homepage: www.elsevier.com/locate/infsof

Muhammad Usman a,∗, Ricardo Britto a, Jürgen Börstler a, Emilia Mendes b

a Department of Software Engineering (DIPT), Blekinge Institute of Technology (BTH), Karlskrona, 371 79, Sweden

b Department of Computer Science and Engineering (DIDD), Blekinge Institute of Technology (BTH), Karlskrona, 371 79, Sweden

Article info

Article history:

Received 29 September 2015

Revised 13 January 2017

Accepted 14 January 2017

Available online xxx

Keywords:

Taxonomy

Classification

Software engineering

Systematic mapping study

Abstract

Context: Software Engineering (SE) is an evolving discipline with new subareas being continuously developed and added. To structure and better understand the SE body of knowledge, taxonomies have been proposed in all SE knowledge areas.

Objective: The objective of this paper is to characterize the state-of-the-art research on SE taxonomies.

Method: A systematic mapping study was conducted, based on 270 primary studies.

Results: An increasing number of SE taxonomies have been published since 2000 in a broad range of venues, including the top SE journals and conferences. The majority of taxonomies can be grouped into the following SWEBOK knowledge areas: construction (19.55%), design (19.55%), requirements (15.50%) and maintenance (11.81%). Illustration (45.76%) is the most frequently used approach for taxonomy validation. Hierarchy (53.14%) and faceted analysis (39.48%) are the most frequently used classification structures. Most taxonomies rely on qualitative procedures to classify subject matter instances, but in most cases (86.53%) these procedures are not described in sufficient detail. The majority of the taxonomies (97%) target unique subject matters and many taxonomy-papers are cited frequently. Most SE taxonomies are designed in an ad-hoc way. To address this issue, we have revised an existing method for developing taxonomies in a more systematic way.

Conclusion: There is a strong interest in taxonomies in SE, but few taxonomies are extended or revised. Taxonomy design decisions regarding the used classification structures, procedures and descriptive bases are usually not well described and motivated.

© 2017 Published by Elsevier B.V.

1 Introduction

In science and engineering, a systematic description and organization of the investigated subjects helps to advance the knowledge in this field [1]. This organization can be achieved through the classification of the existing knowledge. Knowledge classification has supported the maturation of different knowledge fields mainly in four ways:

• Classification of the objects of a knowledge field provides a common terminology, which eases the sharing of knowledge [1–3].

• Classification can provide a better understanding of the interrelationships between the objects of a knowledge field [1].

• Classification can help to identify gaps in a knowledge field [1–3].

• Classification can support decision making processes [1].

Summarizing, classification can support researchers and practitioners in generalizing, communicating and applying the findings of a knowledge field [4].

Software Engineering (SE) is a comprehensive and diverse knowledge field that embraces a myriad of different research subareas. The knowledge within many subareas is already classified, in particular by means of taxonomies [5–9]. According to the Oxford English Dictionary [10], a taxonomy is “a scheme of classification”. A taxonomy allows for the description of terms and their relationships in the context of a knowledge area. The concept of taxonomy was originally proposed by Carolus Linnaeus [11] to group and classify organisms by using a fixed number of hierarchical levels. Nowadays, different classification structures (e.g. hierarchy, tree and faceted analysis [12]) have been used to construct taxonomies in different knowledge fields, such as Education [13], Psychology [14] and Computer Science [15].

∗ Corresponding author.

E-mail addresses: muhammad.usman@bth.se (M. Usman), ricardo.britto@bth.se (R. Britto), jurgen.borstler@bth.se (J. Börstler), emilia.mendes@bth.se (E. Mendes).

http://dx.doi.org/10.1016/j.infsof.2017.01.006

0950-5849/© 2017 Published by Elsevier B.V.

Taxonomies have contributed to maturing the SE knowledge field. Nevertheless, like the taxonomy proposed by Carolus Linnaeus, which keeps being extended [16], SE taxonomies are expected to evolve over time, incorporating new knowledge. In addition, due to the wide spectrum of SE knowledge, there is still a need to classify the knowledge in many SE subareas.

Although many SE taxonomies have been proposed in the literature, it appears that taxonomies have been designed or evolved without following particular patterns, guidelines or processes. A better understanding of how taxonomies have been designed and applied in SE could be very useful for the development of new taxonomies and the evolution of existing ones.

To the best of our knowledge, no systematic mapping or systematic literature review has been conducted to identify and analyze the state-of-the-art of taxonomies in SE. In this paper, we describe a systematic mapping study [17,18] aiming to characterize the state-of-the-art research on SE taxonomies.

The main contribution of this paper is a characterization of the state-of-the-art of taxonomies in SE. Our results also show that most taxonomies are developed in an ad-hoc way. We therefore revised a taxonomy development method in the light of the findings of this mapping study, our own experience and literature from other research fields with more maturity regarding taxonomies (e.g., psychology and computer science).

The remainder of this paper is organized as follows: Section 2 describes related background. Section 3 presents the employed research methodology. The current state-of-the-art on taxonomies in SE, as well as the validity threats associated with the mapping study, are presented in Section 4. In Section 5, we present a revised method for developing SE taxonomies, along with an illustration of the revised method and its limitations. Finally, our conclusions and view on future work are provided in Section 6.

2 Background

In this section, we discuss important aspects related to taxonomy design that serve as motivation for the research questions described in Section 3.

2.1 Taxonomy definition and purpose

Taxonomy is neither a trivial nor a commonly used term. According to the most cited English dictionaries, a taxonomy is mainly a classification mechanism:

• “… naming and organizing things, especially plants and animals, into groups that share similar qualities”.¹

• “… natural relationships”.²

• “… of something, especially organisms” or “A scheme of classification”.³

Since taxonomy is mainly defined as a classification system, one of the main purposes of developing a taxonomy should be to classify something.

2.2 Subject matter

The first step in the design of a new taxonomy is to clearly define the units of classification. In software engineering this could be requirements, design patterns, architectural views, methods and techniques, defects etc. This requires a thorough understanding of the subject matter to be able to define clear taxonomy classes or categories that are commonly accepted within the field [19,20].

1 www.dictionary.cambridge.org

2 www.merriam-webster.com

3 www.oxforddictionaries.com

2.3 Descriptive bases / terminology

Once the subject matter is clearly defined or an existing definition is adopted, the descriptive terms, which can be used to describe and differentiate subject matter instances, must also be specified. An appropriate description of these bases for classification is important to perform the comparison of subject matter instances. Descriptive bases can also be viewed as a set of attributes that can be used for the classification of the subject matter instances [19,20].

2.4 Classification procedure

Classification procedures define how subject matter instances (e.g., defects) are systematically assigned to classes or categories. A taxonomy’s purpose, descriptive bases and classification procedures are related and dependent on each other. Depending upon the measurement system used, the classification procedure can be qualitative or quantitative. Qualitative classification procedures are based on nominal scales. In qualitative classification systems, the relationship between the classes cannot be determined. Quantitative classification procedures, on the other hand, are based on numerical scales [20].

2.5 Classification structure

As mentioned above, a taxonomy is mainly a classification mechanism. According to Rowley and Farrow [21] there are two main approaches to classification: enumerative and faceted. In enumerative classification all classes are fixed, making a classification scheme intuitive and easy to apply. It is, however, difficult to enumerate all classes in immature or evolving domains. In faceted classification, aspects of classes are described that can be combined and extended. Kwasnik [12] describes four main approaches to structure a classification scheme (classification structures): hierarchy, tree, paradigm and faceted analysis.

Hierarchy [12] leads to taxonomies with a single top class that “includes” all sub- and sub-sub classes, i.e. a hierarchical relationship with inheritance (“is-a” relationship). Consider, for example, the hierarchy of students in an institution wherein the top class “student” has two sub-classes of “graduate student” and “undergraduate student”. These sub-classes can further have sub-sub classes and so forth. A true hierarchy ensures the mutual exclusivity property, i.e. an entity can only belong to one class. Mutual exclusivity makes hierarchies easy to represent and understand; however, a hierarchy cannot represent multiple inheritance relationships. Hierarchy is also not suitable in situations when researchers have to include multiple and diverse criteria for differentiation. To define a hierarchical classification, it is mandatory to have good knowledge of the subject matter to be classified; the classes and differentiating criteria between classes must be well defined early on.
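The “student” hierarchy described above can be written down as a small data structure. The following is only an illustrative sketch, not part of the reviewed taxonomies: the names PARENT, ancestors and is_a are our own, and the single-parent mapping is one possible way to express that a true hierarchy has no multiple inheritance.

```python
# Each class maps to exactly one parent (None for the top class).
# Allowing only one parent per class is what rules out multiple inheritance.
PARENT = {
    "student": None,
    "graduate student": "student",
    "undergraduate student": "student",
}


def ancestors(cls, parent=PARENT):
    """Return the chain of super-classes of cls, nearest first."""
    chain = []
    current = parent[cls]
    while current is not None:
        chain.append(current)
        current = parent[current]
    return chain


def is_a(cls, candidate, parent=PARENT):
    """True if cls is the candidate class or inherits from it ("is-a")."""
    return cls == candidate or candidate in ancestors(cls, parent)


print(is_a("graduate student", "student"))
print(is_a("student", "graduate student"))
```

Representing gender in addition to study level would require a second parent per class, which this structure deliberately cannot express.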

Tree [12] is similar to the hierarchy; however, there is no inheritance relationship between the classes of tree-based taxonomies. In this kind of classification structure, common types of relationships between classes are “part-whole”, “cause-effect” and “process-product”. An example is a tree representing the whole-part relationship between a country, its provinces and its cities. Tree and hierarchy share similar strengths and limitations.

Paradigm [12] leads to taxonomies with two-way hierarchical relationships between classes. The classes are described by a combination of two attributes at a time. For example, paradigm would be suitable if we also have to represent gender in the “student” hierarchy example mentioned above. It can also be viewed as a two-dimensional matrix whose vertical and horizontal axes allow for the inclusion of two attributes of interest. This type of classification structure shares similar strengths and limitations with the hierarchy structure.

Faceted analysis [12,22] leads to taxonomies whose subject matters are classified using multiple perspectives (facets). The basic principle in faceted analysis is that there is more than one perspective from which to view and classify a complex entity. Each facet is independent and can have its own classes, which enables facet-based taxonomies to be easily adapted so they can evolve smoothly over time. In order to properly classify CASE tools, for example, multiple facets need to be considered. These facets may include supported platform(s), license type, SE area, web support etc. Table 1 depicts the application of these multiple facets to classify two hypothetical CASE tools. Faceted analysis is suitable for new and evolving fields, since it is not required to have the complete knowledge related to the selected subject matter to design a facet-based taxonomy. However, it can be challenging to define an initial set of facets. In addition, although it is possible to define relationships between facets, in most cases the facets are independent and have no meaningful relationship with each other.

Table 1. Faceted analysis example.

Tool name | Platform(s) | License type | SE area | Web support
Tool 2 | Windows | Proprietary | Construction | No

Fig. 1. Employed systematic mapping process.
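To make the idea of independent facets concrete, the CASE tool example can be sketched as a record with one field per facet. This is a minimal illustration under our own assumptions: the class CaseTool and the function select are hypothetical names, the “Tool 2” values follow Table 1, and the “Tool 1” values are invented purely for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseTool:
    """One subject matter instance, described by four independent facets."""
    name: str
    platforms: frozenset
    license_type: str
    se_area: str
    web_support: bool


def select(tools, **facets):
    """Return the names of tools whose facet values match every constraint."""
    return [
        tool.name
        for tool in tools
        if all(getattr(tool, facet) == value for facet, value in facets.items())
    ]


tools = [
    # Hypothetical values, invented for illustration only.
    CaseTool("Tool 1", frozenset({"Linux", "Windows"}), "Open source", "Design", True),
    # Values as listed for Tool 2 in Table 1.
    CaseTool("Tool 2", frozenset({"Windows"}), "Proprietary", "Construction", False),
]

print(select(tools, license_type="Proprietary"))
print(select(tools, se_area="Design", web_support=True))
```

Because each facet is a separate field, a new facet can be added to the record without restructuring the existing ones, which mirrors the adaptability argument made for faceted analysis.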

2.6 Validation

Validation strengthens the reliability and usefulness of taxonomies. Taxonomies can be validated in three ways:

• Orthogonality demonstration – The orthogonality of the taxonomy dimensions and categories is demonstrated [8,20].

• Benchmarking – The taxonomy is compared to similar classification schemes [8].

• Utility demonstration – The utility of a taxonomy is demonstrated by actually classifying subject matter examples [8,20]. The utility of a taxonomy can be demonstrated or exemplified by classifying existing literature or expert opinion, or by employing more rigorous validation approaches such as a case study or experiment.

3 Research methodology

We chose the systematic mapping study (SMS) method to identify and analyze the state-of-the-art of taxonomies in SE, because this method works well for broad and weakly defined research areas [17,18]. We employed the guidelines by Kitchenham and Charters [17] and partly implemented the mapping process provided by Petersen et al. [18]. The employed mapping process is summarized in Fig. 1 and described further in Subsections 3.1–3.5.

3.1 Research questions

The following research questions were formulated to guide this SMS:

• Question 1 (RQ1) – What taxonomy definitions and purposes are provided by publications on SE taxonomies?

• Question 2 (RQ2) – Which subject matters are classified in SE taxonomies?

• Question 3 (RQ3) – How is the utility of SE taxonomies demonstrated?

• Question 4 (RQ4) – How are SE taxonomies structured?

• Question 5 (RQ5) – To what extent are SE taxonomies used?

• Question 6 (RQ6) – How are SE taxonomies developed?

The main idea behind RQ1 is to identify how and why the term “taxonomy” is used in primary studies that claim to present a taxonomy. RQ2 focuses on identifying the subject matters classified by means of taxonomies in SE. RQ3 focuses on identifying the approaches used to demonstrate the utility of SE taxonomies, which is one of the ways of validating a taxonomy (see Section 2). With RQ4 we intend to identify the classification structures, related descriptive bases and classification procedures employed to design SE taxonomies. RQ5 focuses on the extent to which proposed SE taxonomies are used. Finally, RQ6 addresses in which ways SE taxonomies are developed, i.e. whether there are guidelines, methods, and processes that guide the development of taxonomies in a systematic way.

3.2 Search process

The search process employed in this work is displayed in Fig. 2 and has 6 activities.

First, we defined the terms to be included in our search string. We selected all SWEBOK knowledge areas [7] to be included as terms, except for the three knowledge areas on related disciplines (Computing Foundations, Mathematical Foundations and Engineering Foundations). We also included the term “Software Engineering”, to augment the comprehensiveness of the search string. Finally, to reduce the scope of the search string to studies that report SE taxonomies, we included the term “taxonomy”.

Since some of the knowledge areas are referred to by the SE community using other terms (synonyms), we also included their synonyms. Specifically, the following synonyms were included in the search string:

• Requirements – requirements engineering

• Construction – software development

• Design – software architecture

• Management – software project management, software management

• Process – software process, software life cycle

• Models and methods – software model, software methods

• Economics – software economics

The selected SWEBOK knowledge areas and the term “Software Engineering” were all linked using the operator OR. The term “taxonomy” was linked with the other terms using the operator AND. The final search string is shown below.

(“software requirements” OR “requirements engineering” OR “software design” OR “software architecture” OR “software construction” OR “software development” OR “software testing” OR “software maintenance” OR “software configuration management” OR “software engineering management” OR “software project management” OR “software management” OR “software engineering process” OR “software process” OR “software life cycle” OR “software engineering models and methods” OR “software model” OR “software methods” OR “software quality” OR “software engineering professional practice” OR “software engineering economics” OR “software economics” OR “software engineering”) AND (taxonomy OR taxonomies)

Fig. 2. Search process.

Table 2. Summary of search results.
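The construction rule for the search string (knowledge-area terms and synonyms joined with OR, then combined with the taxonomy terms using AND) can be sketched as follows. This is only an illustration: the names KNOWLEDGE_AREA_TERMS, TAXONOMY_TERMS and build_search_string are our own, and database-specific syntax such as the restriction to title, abstract and keywords is not modelled.

```python
# Knowledge-area terms and synonyms as they appear in the final search string.
KNOWLEDGE_AREA_TERMS = [
    "software requirements", "requirements engineering",
    "software design", "software architecture",
    "software construction", "software development",
    "software testing", "software maintenance",
    "software configuration management",
    "software engineering management", "software project management",
    "software management",
    "software engineering process", "software process", "software life cycle",
    "software engineering models and methods", "software model",
    "software methods",
    "software quality", "software engineering professional practice",
    "software engineering economics", "software economics",
    "software engineering",
]
TAXONOMY_TERMS = ["taxonomy", "taxonomies"]


def build_search_string(area_terms, taxonomy_terms):
    """Join quoted area terms with OR, then AND them with the taxonomy terms."""
    areas = " OR ".join(f'"{term}"' for term in area_terms)
    taxa = " OR ".join(taxonomy_terms)
    return f"({areas}) AND ({taxa})"


print(build_search_string(KNOWLEDGE_AREA_TERMS, TAXONOMY_TERMS))
```

Keeping the terms in a list makes it easy to see the effect of adding alternative terms such as “ontology”, which the study deliberately left out to keep the number of hits manageable.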

Although SE knowledge classification could be named in different ways, e.g., taxonomy, ontology, classification and classification scheme [1], we limited the scope of this paper to taxonomies. Extending our search string to include the terms “ontology” and “classification scheme” would have led to an excessive number of search results that would have been infeasible to handle⁴. Using alternative terms would also force the authors to interpret whether the primary studies’ authors actually intended to present a taxonomy when they do not explicitly refer to taxonomies. To mitigate this threat to validity, we restricted the scope to taxonomies.

Once the search string was designed, we selected the primary sources to search for relevant studies. Scopus⁵, Compendex/Inspec⁶ and Web of Science⁷ were selected because they cover most of the important SE databases, such as IEEE, Springer, ACM and Elsevier. In addition, the selected primary sources are able to handle advanced queries. The search string was applied on metadata (i.e. title, abstract and author keywords) in August 2014 on the selected data sources. We later updated the search results by applying the search string again in February 2016, to fetch studies published between September 2014 and December 2015. Table 2 presents the number of search results for each data source.

4 Inclusion of the terms “ontolog∗” and “classification” returned 10,474 hits in total just for Scopus.

5 www.scopus.com

6 www.engineeringvillage.com

7 apps.webofknowledge.com

3.3 Study selection process

The selection process employed in this work is displayed in Fig. 3 and detailed as follows.

First, the following inclusion and exclusion criteria were defined:

• Inclusion criteria

1. Studies that propose or extend a taxonomy AND

2. Studies that are within Software Engineering (SE), according to SWEBOK’s KAs (see Subsection 3.2).

• Exclusion criteria

1. Studies where the full text is not accessible OR

2. Studies that do not propose or extend a SE taxonomy OR

3. Studies that are not written in English OR

4. Studies that are not reported in a peer-reviewed workshop, conference, or journal.
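The logical structure of these criteria (inclusion criteria joined by AND, exclusion criteria joined by OR) can be expressed as a simple predicate. The sketch below is illustrative only; the Study record and its field names are hypothetical and do not correspond to any tool used in the study.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Screening judgments for one candidate study (hypothetical fields)."""
    proposes_or_extends_se_taxonomy: bool
    within_swebok_ka: bool
    full_text_accessible: bool
    written_in_english: bool
    peer_reviewed_venue: bool


def is_included(study):
    """Apply the inclusion criteria (AND) and the exclusion criteria (OR)."""
    meets_inclusion = (
        study.proposes_or_extends_se_taxonomy and study.within_swebok_ka
    )
    meets_exclusion = (
        not study.full_text_accessible
        or not study.proposes_or_extends_se_taxonomy
        or not study.written_in_english
        or not study.peer_reviewed_venue
    )
    return meets_inclusion and not meets_exclusion


candidate = Study(True, True, True, True, True)
print(is_included(candidate))
```

A study is kept only if it satisfies both inclusion criteria and triggers none of the exclusion criteria.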

The selection of primary studies was conducted using a two-stage screening procedure. In the first stage, only the abstracts and titles of the studies were considered. In the second stage, the full texts were read. Note that we used in both stages an inclusive approach to avoid premature exclusion of studies, i.e. if there was doubt about a study, such a study was to be included.

For the first stage (level-1 screening), the total number of 1517 studies were equally divided between the two first authors. As a result, 507 studies were judged as potentially relevant.

To increase the reliability of the level-1 screening result, the third author screened a random sample of 10.30% (78 studies) from the studies screened by the first author and the fourth author screened a random sample of 10.28% (78 studies) from the studies screened by the second author. The first and third authors had the same judgment for 91% (71) of the studies. The second and fourth authors had the same judgment for 93.6% (73) of the studies.

To evaluate the reliability of the inter-rater agreement between the authors, we calculated the Cohen’s kappa coefficient [23]. The Cohen’s kappa coefficient between the first and third authors was statistically significant (significance level = 0.05) and equal to 0.801. The Cohen’s kappa coefficient between the second and fourth authors was also statistically significant (significance level = 0.05) and equal to 0.857. According to Fleiss et al. [23], Cohen’s kappa coefficient values above 0.75 mean an excellent level of agreement.
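For readers unfamiliar with the coefficient, Cohen’s kappa for two raters making include/exclude decisions can be computed as sketched below. The function name and the example counts are our own assumptions: the paper reports only the totals (e.g. 71 agreements out of 78), not how agreements and disagreements split between “include” and “exclude”, so the counts are invented for illustration and do not reproduce the reported value of 0.801.

```python
def cohens_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two raters and two categories (include/exclude).

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    marginal include/exclude rates.
    """
    n = both_include + only_a + only_b + both_exclude
    p_o = (both_include + both_exclude) / n
    a_include = (both_include + only_a) / n
    b_include = (both_include + only_b) / n
    p_e = a_include * b_include + (1 - a_include) * (1 - b_include)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical split of 78 double-screened studies with 71 agreements.
kappa = cohens_kappa(both_include=25, only_a=4, only_b=3, both_exclude=46)
print(round(kappa, 3))
```

Because kappa corrects for chance agreement, it is lower than the raw agreement rate of 71/78, and its exact value depends on how often each rater chose “include”.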

The level-2 screening (second stage), performed by the first and second authors, consisted of applying the selection criteria to the full text of the studies selected during the level-1 screening. The total number of 507 studies were equally divided between the first two authors. As a result, 280 studies were judged as relevant.

Fig. 3. Selection process.

Table 3. Rationale for excluded studies.

Not proposing or evolving a SE taxonomy | 1167
Total included after study selection | 280
Total included after data extraction | 270

To increase the reliability of the level-2 screening, a two-step validation was performed, as follows:

1. The first author screened 27.67% (70) of the studies deemed as relevant by the second author during the level-2 screening (randomly selected) and vice-versa. No disagreements were found between the authors.

2. Nine studies were randomly selected from each of the two sets allocated to the first two authors for further validation. The third author applied the study selection process on these 18 studies (about 6.43% of 280) for validation purposes. No disagreements were found with respect to the study selection (i.e. include/exclude) decisions.

During the entire screening process (stages 1 and 2), we tracked the reason for each exclusion, as presented in Table 3.

3.4 Extraction process

The extraction process employed in this work is summarized in Fig. 4 and consists of four main steps: define a classification scheme, define an extraction form, extract data, and validate the extracted data.

We designed a classification scheme by following Petersen et al.’s guidelines [18]. It has the following facets:

• Research type – This facet is used to distinguish between different types of studies (adapted from Wieringa et al. [24]).

Evaluation research – A study that reports a taxonomy implemented in practice, i.e. evaluation in a real environment, in general by means of the case study method.

Validation research – A study that reports a taxonomy that was not implemented in practice yet, although it was validated in a laboratory environment, in general by means of an experiment.

Solution proposal – A study that reports a taxonomy that was neither implemented in practice nor validated, although it is supported by a small example (illustration) or a good line of argumentation.

Philosophical paper – A study that reports a taxonomy that has no type of evaluation, validation or illustration.

• SE knowledge area – This facet is used to distinguish between the SE knowledge areas in which taxonomies have been proposed. The categories of this facet follow the SWEBOK [7]: software requirements, software design, software construction, software testing, software maintenance, software configuration management, software engineering management, software engineering process, software engineering models and methods, software quality, software engineering professional practice and software engineering economics.

• Presentation approach – This facet is used to classify the studies according to the overall approach used to present a taxonomy: textual or graphical.

For the data extraction, the relevant studies (280) were equally divided between the first and second authors. For each paper, data was collected and later stored in a spreadsheet using the data extraction form shown in Table 4.

To increase the reliability of the extracted data, a two-step validation was performed, as follows:

1. The first author independently re-extracted the data of 50% (70) of the studies originally extracted by the second author (randomly selected) and vice-versa. Five disagreements were identified and all of them were related to the item “classification structure”.

2. Eighteen studies were randomly selected from the studies originally extracted by the first and second authors (9 studies from each author). Those studies were independently re-extracted by the third author. Twenty-three disagreements were identified: 2 on the “taxonomy purpose”, 10 on “classification structure”, 2 on “classification procedure type”, 3 on “classification procedure description” and 6 on “validation approach”.

Fig. 4. Extraction process.

Table 4. Data extraction form.

Data item(s) | Description
Citation data | Title, author(s), year and publication venue
Taxonomy definition | Definition of taxonomy that is used or referred to
Purpose | Text that states the purpose for the taxonomy
Purpose keyword | Key word used in the paper to describe the purpose (e.g. classify, understand, describe)
Subject matter | The name of the thing/concept area that is taxonomized
Descriptive bases | Is the subject matter defined in sufficient detail/clarity to enable classification (Yes/No)
Classification structure | Hierarchy, tree, paradigm, or faceted analysis, according to Kwasnik [12]
Classification procedure | The criteria for putting items in different classes (qualitative, quantitative or no details provided)
Classification procedure description | Do the authors explicitly describe the classification procedure (Yes/No)
Design method | Did the authors employ any systematic approach to design the reported taxonomy? If so, which approach?
Presentation approach | Textual or graphical
Utility demonstration | Is the utility of the taxonomy demonstrated? If so, how (e.g. illustration, case study, experiment)?
Primary knowledge area | Primary knowledge area as per SWEBOK v3 [7]
Secondary knowledge area | Secondary knowledge area as per SWEBOK v3 (if applicable)
Number of citations | Number of times a primary study is cited by other studies, as per Google Scholar

All disagreements except for “classification structure” were easily resolved. We believe that the high level of disagreement on the item “classification structure” was due to the fact that none of the studies explicitly stated and motivated the employed classification structure, which demanded the inference of such data from the text in each paper.

To improve the reliability of the extracted data, we decided to re-screen all 280 papers, focusing only on the item “classification structure”. First, we discussed classification structures in detail (based on Kwasnik [12]) to come to a common understanding of the terms. Second, three of us did an independent re-assessment of the classification structure of 52 papers. As a result, we reached full agreement on 50 papers (3 identical results) and partial agreement on 2 papers (2 reviewers agreeing). There were no primary studies without full or partial agreement. Third, the remaining 228 studies were re-assessed by the first and second authors and they reached agreement on 216 papers. The remaining 12 papers were independently re-assessed by the third author, who did not know the results from the other two reviewers. In the end, full agreement was achieved for 50 studies and partial agreement was achieved for 230 studies.

During the re-assessment of the primary studies, 10 studies were excluded because they do not present taxonomies, reducing the final number of primary studies to 270⁸ (see Table 3).

Fig. 5. Analysis process.

3.5 Analysis process

Fig. 5 presents the analysis process conducted herein. First, we classified the extracted data using the scheme defined in Subsection 3.4. This led to the results detailed in Section 4. We also performed a quantitative analysis of the extracted data to answer the research questions of this paper. Finally, the overall result of the data analysis (see Section 4), along with information from additional literature ([12,19,20]), was used to revise an existing method previously proposed to design SE taxonomies [25], as detailed in Section 5.

8 The full list of the included 270 primary studies is available at http://tinyurl.com/jrdaxhh

Fig. 6. Year and venue wise distributions.

4 Results

In this section, we describe the results of the mapping study reported herein, which are based on the data extracted from 270 papers reporting 271 taxonomies (one paper presented two taxonomies). The percentages in Sections 4.1 and 4.7 reflect the number of papers (270), whereas the percentages in all other subsections reflect the number of taxonomies (271).

4.1 General results

Fig. 6 shows that SE taxonomies have been proposed since 1987, with an increasing number of these published after the year 2000, which suggests a higher interest in this research topic.

Table 5 shows that 53.7% (145) of the studies were published in relevant conferences on the topics of maintenance (International Conference on Software Maintenance), requirements engineering (Requirements Engineering Conference) or general SE topics (e.g. International Conference on Software Engineering). Taxonomies were published at 99 unique conferences, with 78 featuring only a single SE taxonomy publication. These results further indicate a broad interest in SE taxonomies in a wide range of SE knowledge areas.

Table 5 also shows that 33.7% (91) of the primary studies were published as journal articles in 44 unique journals. Taxonomies have been published frequently in relevant SE journals (e.g. IEEE Transactions on Software Engineering and Information and Software Technology). We believe that this has been the case because the scope of these journals is not confined to a specific SE knowledge area.

Primary studies were also published in 28 unique workshops (34 – 12.6%). As for journals and conferences, the results indicate an increasing interest in SE taxonomies in a broad range of SE knowledge areas.

Fig. 7a–h depict the yearly distribution of SE taxonomies by knowledge area for the KAs with 10 or more taxonomies. Note that most knowledge areas follow an increasing trend after 2000, with many taxonomies for construction, design, and quality in the 1980s and 1990s.

Table 5. Publication venues with more than two taxonomy papers.

Venue | Papers
IEEE Intl Conference on Software Maintenance (ICSM) | 10
Intl Conference on Requirements Engineering (RE) | 6
Intl Conference on Software Engineering (ICSE) | 5
Hawaii Intl Conference on Systems Sciences (HICSS) | 4
Asia Pacific Software Engineering Conference (APSEC) | 4
European Conference on Software Maintenance and Reengineering (CSMR) | 4
Intl Conference on Software Engineering and Knowledge Engineering |
Intl Symposium on Empirical Software Engineering and Measurement (ESEM) | 4
Americas Conference on Information Systems (AMCIS) | 3
IEEE Transactions on Software Engineering (TSE) | 11
Information and Software Technology (IST) | 9
Journal of Software: Evolution and Process | 5
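The venue shares reported in this subsection follow directly from the paper counts and can be rechecked with a few lines of arithmetic. The sketch below is only a convenience for the reader; the names TOTAL_PAPERS, papers_by_venue_type and share are our own.

```python
TOTAL_PAPERS = 270

# Paper counts per venue type as reported in Section 4.1.
papers_by_venue_type = {"conference": 145, "journal": 91, "workshop": 34}


def share(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# The three venue types should account for all primary studies.
assert sum(papers_by_venue_type.values()) == TOTAL_PAPERS

for venue_type, count in papers_by_venue_type.items():
    print(f"{venue_type}: {count} papers ({share(count, TOTAL_PAPERS)}%)")
```

The same function can be applied with a total of 271 for the subsections whose percentages are based on the number of taxonomies rather than papers.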

Fig. 7. Yearly distribution of primary studies by KAs. Horizontal axes represent the years (starting 1987), while vertical axes denote the number of taxonomies.

4.2 Classification scheme results

In this section, we present the results corresponding to the

threefacetsoftheclassificationschemedescribedinSection 3 ,i.e

SEknowledgearea(KA),researchtypeandpresentationapproach

The vertical axis in Fig. 8 depicts the SE knowledge areas in which taxonomies have been proposed. Construction and design are the leading SE knowledge areas, each with 53 (19.55%) taxonomies. These are relatively mature SE fields with a large body of knowledge and a high number of subareas.

A high number of taxonomies have also been proposed in the requirements (42 – 15.50%), maintenance (32 – 11.81%) and testing (27 – 9.96%) knowledge areas. Few taxonomies have been proposed in economics (3 – 1.11%) and professional practice (3 – 1.11%), which are more recent knowledge areas.

The results show that most SE taxonomies (76.37%) are proposed in the requirements, design, construction, testing and maintenance knowledge areas, which correspond to the main activities in a typical software development process [26].
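The combined share of the five leading knowledge areas can be re-derived from the per-area counts reported above. The following is a minimal Python sketch, assuming the total of 271 taxonomies stated in Section 4.4; the names `ka_counts` and `share` are illustrative and not part of the study.

```python
# Counts of taxonomies per knowledge area, as reported in the text.
TOTAL_TAXONOMIES = 271  # total number of taxonomies (see Section 4.4)

ka_counts = {
    "construction": 53,
    "design": 53,
    "requirements": 42,
    "maintenance": 32,
    "testing": 27,
}


def share(count, total=TOTAL_TAXONOMIES):
    """Return the percentage share of `count` in `total`."""
    return 100.0 * count / total


core_total = sum(ka_counts.values())
core_share = share(core_total)

# Print one row per knowledge area, then the combined row.
for ka, n in ka_counts.items():
    print(f"{ka:<13} {n:>3}  {share(n):5.2f}%")
print(f"{'all five':<13} {core_total:>3}  {core_share:5.2f}%")
```

Computed from the raw counts, the five areas cover 207 of the 271 taxonomies, roughly 76.4%. The reported 76.37% equals the sum of the individually reported percentages.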

The horizontal axis in Fig. 8 shows the distribution of taxonomies by research types, according to Wieringa et al. [24]. Most taxonomies are reported in papers that are classified as “solution proposals” (135 – 49.82%), wherein the authors propose a taxonomy and explain or apply it with the help of an illustration. Ninety-one taxonomies (33.58%) are reported in “philosophical papers”, wherein authors propose a taxonomy, but do not provide any kind of validation, evaluation or illustration. Relatively fewer taxonomies are reported in “evaluation papers” (34 – 12.54%) and “validation papers” (11 – 4.06%).

Fig. 8 also depicts the classification of the taxonomies using two aspects of the classification scheme, i.e., SE knowledge area and research type.

Taxonomies in the knowledge areas construction and design are mostly reported either as solution proposals (construction – 27; design – 31) or philosophical papers (construction – 20; design – 17). Taxonomies in the knowledge areas requirements, maintenance and testing are better distributed across different research types, wherein besides the solution proposal and the philosophical research types, a reasonable percentage of taxonomies are reported as evaluation or validation papers.
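A systematic map of this kind is essentially a cross-tabulation of two facets. The sketch below rebuilds the construction and design rows from the counts given in the text and derives how many taxonomies in each area remain for the evaluation and validation types. It assumes each taxonomy has exactly one research type; the data layout and variable names are illustrative only.

```python
from collections import Counter

# (knowledge area, research type) -> number of taxonomies, taken from the text.
reported = {
    ("construction", "solution proposal"): 27,
    ("construction", "philosophical paper"): 20,
    ("design", "solution proposal"): 31,
    ("design", "philosophical paper"): 17,
}
ka_totals = {"construction": 53, "design": 53}

# Expand the counts into one record per taxonomy, the shape a data
# extraction sheet would have, then tally them into the map.
records = [pair for pair, n in reported.items() for _ in range(n)]
systematic_map = Counter(records)

# Taxonomies not reported as solution proposals or philosophical papers
# must fall under evaluation or validation papers.
remainder = {
    ka: total - sum(n for (k, _), n in systematic_map.items() if k == ka)
    for ka, total in ka_totals.items()
}
print(remainder)
```

Under these assumptions only 6 construction and 5 design taxonomies are left for evaluation or validation papers, which illustrates why these two areas are described as dominated by the other two research types.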

The horizontal axis in Fig. 9 shows the distribution of taxonomies by presentation approach. Most taxonomies (57.93%) are presented purely as text or table, while 42.07% of the taxonomies are presented through some graphical notation in combination with text.

Fig. 9 also displays the classification of the identified taxonomies in terms of SE knowledge area and presentation approach. The results show two different trends:


Fig. 8. Systematic map – knowledge area vs research type.

• For knowledge areas such as design, quality, models and methods, and process, both textual and graphical approaches are used an almost equal number of times. This suggests that the taxonomies in the KAs that involve a lot of modeling might be better presented using graphical modeling approaches.

• Most taxonomies in construction (35 out of 53), maintenance (23 out of 32), testing (15 out of 27) and software management (7 out of 10) are textually presented.

We extracted data about the following two aspects to answer RQ1:

• Taxonomy definition: We investigated from each study whether or not the authors made any attempt to communicate their understanding about the concept of taxonomy by citing or presenting any definition of it.

• Taxonomy purpose: We identified from each study the stated (if any) main purpose for designing a taxonomy.

Fig. 9. Systematic map – knowledge area vs presentation type.

As stated earlier, taxonomy is not a trivial concept. It has been defined in multiple ways (see Section 2 for some definitions). This RQ aims to identify whether authors make an explicit effort to share their perspective on taxonomy by adopting/using a definition. The results show that only 6.3% (17) of the taxonomies were reported with a definition for the term “taxonomy”. Out of these 17 taxonomies, three use the Cambridge dictionary’s definition (see Section 2), eight studies do not provide an explicit source and the remaining six have other unique references: The American Heritage Dictionary⁹, Carl Linnaeus [11], Whatis¹⁰, the IEEE standard taxonomy for SE standards, Doty and Glick [27] and

9 www.ahdictionary.com/

10 www.whatis.com


Table 6

Approaches for utility demonstration

Fleishman et al. [28]. For the remaining 93.7% (254) taxonomies, no definition of “taxonomy” was provided.

To identify the purpose of each taxonomy, we extracted the relevant text, referred to here as purpose descriptions, from each of the primary studies, using a process similar to open coding [29,30]. As codes we used the keywords used in the primary studies to describe a taxonomy’s purpose.

For about 56% of the taxonomies, the authors used “classify” (48.80%) or “categorize” (7.74%) to describe the purpose of their taxonomy. For 5.9% of the taxonomies it was not possible to identify a specific purpose. For the remaining taxonomies (38.37%), we found 41 different terms for describing the purpose, e.g., “identify”, “understand”, and “describe”.
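The keyword-based coding of purpose descriptions could be approximated along the following lines. This is a minimal sketch and not the authors' procedure: the purpose descriptions are invented examples, and the keyword list and the `code_purpose` function are hypothetical.

```python
from collections import Counter
import re

# Hypothetical purpose descriptions; in the study these were extracted
# from the primary studies.
purpose_descriptions = [
    "We propose a taxonomy to classify software defects.",
    "The taxonomy helps to categorize architectural constraints.",
    "This taxonomy is used to classify testing techniques.",
    "Our goal is to understand sources of technical debt.",
    "A taxonomy of usability requirements.",
]

# Candidate codes: verbs used to state a purpose (illustrative subset).
KEYWORDS = ["classify", "categorize", "identify", "understand", "describe"]


def code_purpose(description, keywords=KEYWORDS):
    """Return the first keyword found as a whole word, or None.

    Only exact word matches count; inflected forms such as
    "classifies" would need stemming, which is left out here.
    """
    words = set(re.findall(r"[a-z]+", description.lower()))
    for keyword in keywords:
        if keyword in words:
            return keyword
    return None


codes = Counter(
    code_purpose(d) or "no specific purpose" for d in purpose_descriptions
)

for code, n in codes.most_common():
    print(f"{code:<20} {n}  ({100 * n / len(purpose_descriptions):.0f}%)")
```

A real coding pass would be done manually and iteratively; the tally only shows how keyword frequencies such as those reported above can be obtained once each description has been assigned a code.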

4.4 RQ2 – Subject matters

In total, we identified 263 unique subject matters¹¹ for the 271 taxonomies, e.g., technical debt, architectural constraints, usability requirements, testing techniques and process models.

The high number of unique subject matters means that almost each taxonomy dealt with a unique subject matter. This might be due to the following reasons:

• Existing taxonomies do not fit their purpose well. Therefore there is a need to define new taxonomies.

• The subject matters for existing taxonomies are so narrowly defined that they are not suitable for usage outside their original context. New taxonomies are therefore developed constantly.

• SE researchers do not reuse or extend existing taxonomies when there is need for organizing SE knowledge, but rather propose new ones.

One indicator for taxonomy use is the number of times each primary study is cited. This analysis is detailed in Subsection 4.7.

The list of subject matters contains mainly technical aspects of SE. Only a few taxonomies deal with people-related subject matters, e.g., stakeholder-related and privacy-related issues.

The results for this research question suggest that taxonomies are rarely revisited, revised or extended. However, many taxonomy papers are highly cited, which shows that there is a strong interest in taxonomies in the SE field.

The mapping of SE taxonomies presented in this paper supports SE researchers in identifying and evolving existing taxonomies. This may lead to the development of a more consistent terminology.

4.5 RQ3 – Utility demonstration

Table 6 displays the approaches used to demonstrate the utility of SE taxonomies. Illustration is the most frequently used approach (124 – 45.76%). Illustration includes approaches such as example, scenario and case.

11 For full list see: http://tinyurl.com/z4mqfnr

Table 7

Classification structure

Table 8

Descriptive bases

Table 9

Classification procedure types

Table 10

Classification procedure descriptions

Case studies have also been used to demonstrate the utility of 34 taxonomies (12.54%). Experiments have been used to demonstrate the utility of 11 taxonomies (4.06%), while the utility of a few taxonomies has also been demonstrated through expert opinion (7 – 2.58%) or survey (4 – 1.48%). Note that 33.58% (91) of the taxonomies did not have their utility demonstrated.

The results related to RQ3 show that a few taxonomies have their utility demonstrated through methods like case study or experiment, while the utility of a large number of taxonomies (33.58%) is not demonstrated by any means. We do not believe that one particular approach would be the best for all contexts; however, we believe that in most cases it would not be enough just to propose a taxonomy.
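The share of taxonomies without any utility demonstration follows from the reported counts per approach. The short Python sketch below assumes that each taxonomy is counted under exactly one approach and that the total is the 271 taxonomies stated in Section 4.4; the variable names are illustrative.

```python
TOTAL_TAXONOMIES = 271  # total number of taxonomies (see Section 4.4)

# Number of taxonomies per utility-demonstration approach, from the text.
demonstrated = {
    "illustration": 124,
    "case study": 34,
    "experiment": 11,
    "expert opinion": 7,
    "survey": 4,
}

with_demo = sum(demonstrated.values())
# Assumption: approaches are mutually exclusive, so the rest has none.
without_demo = TOTAL_TAXONOMIES - with_demo
without_share = round(100 * without_demo / TOTAL_TAXONOMIES, 2)

print(with_demo, without_demo, without_share)
```

Under that assumption, 180 taxonomies have some form of utility demonstration and 91 have none, which corresponds to the 33.58% quoted above.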

To answer RQ4, the following data was gathered: classification structure, descriptive bases, classification procedure and classification procedure description.

Table 7 shows the classification structures of the identified taxonomies. Hierarchy was the most frequently used classification structure (144 – 53.14%), followed by faceted-based structures (107 – 39.48%), tree (14 – 5.17%) and paradigm (6 – 2.21%).
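To make the two most common structures concrete, the sketch below contrasts a hierarchy with a faceted structure. It is only an illustration under the usual reading of these terms: a hierarchy places each item on one path of nested classes, while a faceted structure describes each item along several independent dimensions. The example classes, facets and function names are hypothetical and are not taken from any primary study.

```python
# Hierarchy: every class has exactly one parent, so an item sits on a
# single path from its class up to the root.
hierarchy_parent = {
    "testing technique": None,
    "static technique": "testing technique",
    "dynamic technique": "testing technique",
    "code review": "static technique",
    "unit testing": "dynamic technique",
}


def path_to_root(cls, parent=hierarchy_parent):
    """Return the chain of classes from `cls` up to the root."""
    path = [cls]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


# Faceted: an item is described by one value per independent facet.
facets = {
    "automation": {"manual", "automated"},
    "level": {"unit", "integration", "system"},
}


def classify_faceted(item_values, facets=facets):
    """Validate one allowed value per facet and return the classification."""
    if set(item_values) != set(facets):
        raise ValueError("one value per facet is required")
    for facet, value in item_values.items():
        if value not in facets[facet]:
            raise ValueError(f"{value!r} is not a valid {facet}")
    return tuple(sorted(item_values.items()))


print(path_to_root("unit testing"))
print(classify_faceted({"automation": "automated", "level": "unit"}))
```

In the hierarchy, adding a new perspective means restructuring the tree of classes, whereas in the faceted structure a new facet can be added without touching the existing ones.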

Table 8 displays the status of the taxonomies’ descriptive basis. The majority of the taxonomies have a sufficiently clear description of their elements (248 – 91.51%), while only 23 taxonomies (8.49%) lack a sufficient description.

Table 9 presents the classification procedure types for the identified taxonomies. The majority of the taxonomies employed a qualitative classification procedure (262 – 96.68%), followed by quantitative (7 – 2.58%) and both (2 – 0.74%).

Table 10 displays the status of the taxonomies’ classification procedure description. The majority of the taxonomies do not have an explicit description for the classification procedure (227


References

[1] S. Vegas, N. Juristo, V. Basili, Maturing software engineering knowledge through classifications: a case study on unit testing techniques, IEEE Trans. Softw. Eng. 35 (4) (2009) 551-565.
[2] I. Vessey, V. Ramesh, R.L. Glass, A unified classification system for research in the computing disciplines, Inf. Softw. Technol. 47 (4) (2005) 245-255.
[3] C. Wohlin, Writing for synthesis of evidence in empirical software engineering, in: Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), ACM, New York, NY, USA, 2014, pp. 46:1-46:4.
[4] I. Vessey, V. Ramesh, R.L. Glass, A unified classification system for research in the computing disciplines, Inf. Softw. Technol. 47 (4) (2005) 245-255.
[5] IEEE, IEEE Standard Taxonomy for Software Engineering Standards, Technical Report, IEEE Std 1002-1987, IEEE, 1987.
[6] IEEE, Systems and software engineering – System life cycle processes, Technical Report, ISO/IEC 15288:2008(E) IEEE Std 15288-2008 (Revision of IEEE Std 15288-2004), IEEE, 2008.
[7] P. Bourque, R.E. Fairley (Eds.), Guide to the software engineering body of knowledge v3, IEEE Comput. Soc., 2013.
[8] D. Smite, C. Wohlin, Z. Galvina, R. Prikladnicki, An empirically based terminology and taxonomy for global software engineering, Empirical Softw. Eng. 19 (1) (2014) 105-153.
[9] M. Unterkalmsteiner, R. Feldt, T. Gorschek, A taxonomy for requirements engineering and software test alignment, ACM Trans. Softw. Eng. Methodol. 23 (2) (2014) 16:1-16:38.
[11] C. Linnaeus, System of nature through the three kingdoms of nature, according to classes, orders, genera and species, with characters, differences, synonyms, places (in Latin), 10th ed., Laurentius Salvius, 1758.
[12] B.H. Kwasnik, The role of classification in knowledge representation and discovery, Lib. Trends 48 (1) (1999) 22-47.
[13] B.S. Bloom, Taxonomy of Educational Objectives. Vol. 1: Cognitive Domain, McKay, 1956.
[14] T.E. Moffitt, Adolescence-limited and life-course-persistent antisocial behavior: a developmental taxonomy, Psychol. Rev. 100 (4) (1993) 674-701.
[15] D. Scharstein, R. Szeliski, A taxonomy and evaluation of dense two-frame stereo correspondence algorithms, Int. J. Comput. Vision 47 (1-3) (2002) 7-42.
[17] B. Kitchenham, S. Charters, Guidelines for performing Systematic Literature Reviews in Software Engineering, Technical Report, Keele University, 2007.
[18] K. Petersen, R. Feldt, S. Mujtaba, M. Mattsson, Systematic mapping studies in software engineering, in: Proceedings of the 12th International Conference on Evaluation and Assessment in Software Engineering, in: EASE’08, British Computer Society, Swinton, UK, 2008, pp. 68-77.
[19] R.L. Glass, I. Vessey, Contemporary application-domain taxonomies, IEEE Softw. 12 (4) (1995) 63-76.
[20] G.R. Wheaton, Development of a taxonomy of human performance: A review of classificatory systems relating to tasks and performance, Technical Report, American Institute for Research, Washington DC, 1968.
[22] R. Prieto-Diaz, Implementing faceted classification for software reuse, Commun. ACM 34 (5) (1991) 88-97.
[23] J.L. Fleiss, B. Levin, M.C. Paik, Statistical Methods for Rates and Proportions, John Wiley & Sons, 2013.
